Binary floating-point arithmetic for microprocessor systems

Arithmétique binaire en virgule flottante pour systèmes à microprocesseurs

General Information

Status

Withdrawn

Publication Date

31-Jan-1989

Withdrawal Date

31-Jan-1989

ICS

35.160 - Microprocessor systems

Technical Committee

ISO/IEC JTC 1/SC 25 - Interconnection of information technology equipment

Drafting Committee

ISO/IEC JTC 1/SC 25 - Interconnection of information technology equipment

Current Stage

9599 - Withdrawal of International Standard

Completion Date

14-Feb-2012

Ref Project

Relations

Revised

ISO/IEC/IEEE 60559:2011 - Information technology — Microprocessor Systems — Floating-Point arithmetic

Effective Date

23-Feb-2012

Buy Standard

Standard

IEC 559:1989

English language

12 pages


Standards Content (Sample)

IEC 559:1989

CEI / IEC 559
NORME INTERNATIONALE / INTERNATIONAL STANDARD
Deuxième édition / Second edition
1989-01
Arithmétique binaire en virgule flottante pour systèmes à microprocesseurs
Binary floating-point arithmetic for microprocessor systems
Numéro de référence / Reference number: CEI/IEC 559:1989

Revision of this publication

The technical content of IEC publications is kept under constant review by the IEC, thus ensuring that the content reflects current technology.

Information on the work of revision, the issue of revised editions and amendment sheets may be obtained from IEC National Committees and from the following IEC sources:

- IEC Bulletin
- IEC Yearbook
- Catalogue of IEC Publications (published yearly)

Terminology

For general terminology, readers are referred to IEC Publication 50: International Electrotechnical Vocabulary (IEV), which is issued in the form of separate chapters each dealing with a specific field, the General Index being published as a separate booklet. Full details of the IEV will be supplied on request.

The terms and definitions contained in the present publication have either been taken from the IEV or have been specifically approved for the purpose of this publication.

Graphical and letter symbols

For graphical symbols, and letter symbols and signs approved by the IEC for general use, readers are referred to:

- IEC Publication 27: Letter symbols to be used in electrical technology;
- IEC Publication 617: Graphical symbols for diagrams.

The symbols and signs contained in the present publication have either been taken from IEC Publications 27 or 617, or have been specifically approved for the purpose of this publication.

IEC publications prepared by the same Technical Committee

The attention of readers is drawn to the back cover, which lists IEC publications issued by the Technical Committee which has prepared the present publication.

© IEC 1989. Copyright - all rights reserved

No part of this publication may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without permission in writing from the publisher.

International Electrotechnical Commission, Central Office: 3, rue de Varembé, Geneva, Switzerland

Price code 18. For price, see current catalogue.


CONTENTS

FOREWORD
PREFACE

Clause
1. Scope
   1.1 Implementation objectives
   1.2 Inclusions
   1.3 Exclusions
2. Definitions
3. Formats
   3.1 Sets of values
   3.2 Basic formats
   3.3 Extended formats
   3.4 Combinations of formats
4. Rounding
   4.1 Round to nearest
   4.2 Directed roundings
   4.3 Rounding precision
5. Operations
   5.1 Arithmetic
   5.2 Square root
   5.3 Floating-point format conversions
   5.4 Conversions between floating-point and integer
   5.5 Round floating-point number to integral value
   5.6 Binary ↔ decimal conversion
   5.7 Comparison
6. Infinity, NaNs and signed zero
   6.1 Infinity arithmetic
   6.2 Operations with NaNs
   6.3 The sign bit
7. Exceptions
   7.1 Invalid operations
   7.2 Division by zero
   7.3 Overflow
   7.4 Underflow
   7.5 Inexact
8. Traps
   8.1 Trap handler
   8.2 Precedence
APPENDIX A - Recommended functions and predicates


INTERNATIONAL ELECTROTECHNICAL COMMISSION

BINARY FLOATING-POINT ARITHMETIC FOR MICROPROCESSOR SYSTEMS

FOREWORD

1) The formal decisions or agreements of the IEC on technical matters, prepared by Technical Committees on which all the National Committees having a special interest therein are represented, express, as nearly as possible, an international consensus of opinion on the subjects dealt with.
2) They have the form of recommendations for international use and they are accepted by the National Committees in that sense.
3) In order to promote international unification, the IEC expresses the wish that all National Committees should adopt the text of the IEC recommendation for their national rules in so far as national conditions will permit. Any divergence between the IEC recommendation and the corresponding national rules should, as far as possible, be clearly indicated in the latter.

PREFACE

This standard has been prepared by Sub-Committee 47B: Microprocessor systems, of IEC Technical Committee No. 47: Semiconductor devices. (This Sub-Committee has been taken over by ISO/IEC JTC 1.)

This second edition of IEC Publication 559 replaces the first edition issued in 1982.

The text of this standard is based on the following documents:

Six Months' Rule    Report on Voting
47B(CO)19           47B(CO)26

Full information on the voting for the approval of this standard can be found in the Voting Report indicated in the above table.


1. Scope

1.1 Implementation objectives

It is intended that an implementation of a floating-point system conforming to this standard can be realized entirely in software, entirely in hardware, or in any combination of software and hardware. It is the environment that the programmer or user of the system sees that conforms or fails to conform to this standard. Hardware components that require software support to conform shall not be said to conform apart from such software.

1.2 Inclusions

This standard specifies:

1) basic and extended floating-point number formats;
2) add, subtract, multiply, divide, square root, remainder and compare operations;
3) conversions between integer and floating-point numbers;
4) conversions between different floating-point formats;
5) conversions between basic format floating-point numbers and decimal strings, and
6) floating-point exceptions and their handling, including non-numbers (NaNs).

1.3 Exclusions

This standard does not specify:

1) formats of decimal strings and integers;
2) interpretation of the sign and significand fields of NaNs, or
3) binary ↔ decimal conversions to and from extended formats.

2. Definitions

Biased exponent

The sum of the exponent and a constant (bias) chosen to make the biased exponent's range non-negative.
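
As a hedged illustration of this definition, the following Python sketch extracts the biased exponent from a single-format bit pattern and recovers the unbiased exponent. The bias of 127 and the 8-bit exponent width are taken from Table 1; the field positions (exponent in bits 30 to 23) are an assumption based on the usual single-format layout, since this sample breaks off before the encoding is specified, and the helper name `single_exponents` is hypothetical.

```python
import struct

SINGLE_BIAS = 127  # exponent bias of the single format (Table 1)

def single_exponents(x: float) -> tuple[int, int]:
    """Return (biased exponent e, unbiased exponent E) of x in single format.

    Assumes the usual layout: 1 sign bit, 8 exponent bits, 23 fraction bits.
    Only meaningful for normalized numbers.
    """
    # Pack as a big-endian 32-bit float and reinterpret it as an unsigned integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    e = (bits >> 23) & 0xFF
    return e, e - SINGLE_BIAS

print(single_exponents(1.0))  # (127, 0)
print(single_exponents(8.0))  # (130, 3)
```

Because the bias is added before storage, every stored exponent field is a non-negative integer, which is the property the definition asks for.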


Binary floating-point number

A bit-string characterized by three components: a sign, a signed exponent, and a significand. Its numerical value, if any, is the signed product of its significand and two raised to the power of its exponent. In this standard a bit-string is not always distinguished from a number it may represent.

Denormalized number

A nonzero floating-point number, the exponent of which has a reserved value, usually the format's minimum, and the explicit or implicit leading significand bit of which is zero.

Destination

The location for the result of a binary or unary operation. The destination may be either explicitly designated by the user or implicitly supplied by the system (e.g. intermediate results in sub-expressions or arguments for procedures). Some languages place the results of intermediate calculations in destinations beyond the user's control. Nonetheless, this standard defines the result of an operation in terms of that destination's format as well as the operands' values.

Exponent

The component of a binary floating-point number that normally signifies the integer power to which two is raised in determining the value of the represented number. Occasionally the exponent is called the signed or unbiased exponent.

Fraction

The field of the significand that lies to the right of its implied binary point.
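
The definitions of binary floating-point number, denormalized number, exponent and fraction can be tied together with a small sketch. It is an illustration only: it assumes the double format with 1 sign bit, 11 exponent bits and 52 fraction bits (widths consistent with Table 1; the field order is an assumption, as the encoding clauses are not part of this sample), and the function name `decode_double` is hypothetical.

```python
import struct
from fractions import Fraction

def decode_double(x: float):
    """Split a finite double into (sign, E, significand, exact value)."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    s = bits >> 63
    e = (bits >> 52) & 0x7FF                        # biased exponent field
    f = Fraction(bits & ((1 << 52) - 1), 1 << 52)   # fraction field, 0 <= f < 1
    if e == 0x7FF:
        raise ValueError("infinities and NaNs have no finite numerical value")
    if e == 0:
        # Reserved exponent value: the leading significand bit is 0,
        # so the number is zero (f == 0) or denormalized (f != 0).
        E, significand = -1022, f
    else:
        E, significand = e - 1023, 1 + f
    # Signed product of the significand and two raised to the exponent.
    value = (-1) ** s * significand * Fraction(2) ** E
    return s, E, significand, value

print(decode_double(-6.5))
print(decode_double(5e-324)[1:3])
```

Exact rational arithmetic from `fractions` is used so that the reconstructed value can be compared with the original float without introducing a second rounding.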
Mode

A variable that a user may set, sense, save and restore, to control the execution of subsequent arithmetic operations. The default mode is the mode that a program can assume to be in effect unless an explicitly contrary statement is included either in the program or in its specification.

The following modes shall be implemented:

1) rounding, to control the direction of rounding errors, and in certain implementations
2) rounding precision, to shorten the precision of results.

The implementor may, at his option, implement the following modes:

3) traps disabled/enabled, to handle exceptions.
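
To make "the direction of rounding errors" concrete, the sketch below rounds the exact quotient 1/3 in both directions. It is an emulation rather than a mode variable in the sense defined above: standard Python exposes no rounding-mode control for its binary floats, so the code takes the round-to-nearest result and, where needed, steps to the neighbouring double with `math.nextafter` (Python 3.9 or later). The function name `round_directed` is hypothetical, and Python floats are assumed to be doubles.

```python
import math
from fractions import Fraction

def round_directed(exact: Fraction, toward_positive: bool) -> float:
    """Round an exact rational to a double toward +infinity or -infinity."""
    # CPython rounds int / int correctly (to nearest).
    nearest = exact.numerator / exact.denominator
    error = Fraction(nearest) - exact   # exact rounding error of `nearest`
    if error == 0:
        return nearest                  # already representable
    if toward_positive and error < 0:
        return math.nextafter(nearest, math.inf)
    if not toward_positive and error > 0:
        return math.nextafter(nearest, -math.inf)
    return nearest

third = Fraction(1, 3)
down = round_directed(third, toward_positive=False)
up = round_directed(third, toward_positive=True)
print(down, up)
```

The two results are adjacent doubles that bracket 1/3, which is the behaviour a directed rounding mode is meant to provide.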


NaN

Not a number; a symbolic entity encoded in floating-point format. There are two types of NaNs (see 6.2). Signalling NaNs signal the invalid operation exception (see 7.1) whenever they appear as operands. Quiet NaNs propagate through almost every arithmetic operation without signalling exceptions.
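
The propagation of quiet NaNs can be observed with a few lines of Python. This is a sketch under the assumption that Python floats map to the double format of the host platform and that `float("nan")` yields a quiet NaN, as it does on common platforms; signalling NaNs are not shown because Python offers no portable way to create or trap them.

```python
import math

nan = float("nan")  # assumed to be a quiet NaN

# Arithmetic on a quiet NaN yields a NaN and raises no exception.
results = [nan + 1.0, nan * 0.0, nan - nan, math.sqrt(nan)]
print(all(math.isnan(r) for r in results))

# A NaN compares unequal to everything, itself included.
print(nan == nan, nan != nan)
```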
Result

The bit-string (usually representing a number) that is delivered to the destination.

Significand

The component of a binary floating-point number which consists of an explicit or implicit leading bit to the left of its implied binary point and a fraction field to the right.

Shall

The word "shall" signifies that which is obligatory in any conforming implementation.

Should

The word "should" signifies that which is strongly recommended as being in keeping with the intent of the standard, although architectural or other constraints beyond the scope of this standard may, on occasion, render the recommendations impractical.

Status flag

A variable that may take two states, set and clear. A user may clear a flag, copy it, or restore it to a previous state. When set, a status flag may contain additional system-dependent information, possibly inaccessible to some users. The operations of this standard may, as a side-effect, set some of the following flags: inexact result, underflow, overflow, divide by zero and invalid operation.
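
Python's binary floats do not expose status flags, so the following sketch borrows the standard `decimal` module purely as an analogy. Its arithmetic is decimal rather than binary and is not governed by this standard, but its context offers sticky flags that can be cleared, copied and restored, and traps that can be disabled, much in the spirit of the Status flag definition.

```python
from decimal import Decimal, DivisionByZero, Inexact, localcontext

with localcontext() as ctx:
    ctx.traps[DivisionByZero] = False   # trap disabled: deliver a result, set the flag
    ctx.traps[Inexact] = False
    ctx.clear_flags()                   # clear every status flag

    q = Decimal(1) / Decimal(0)         # division by zero: infinite result, flag set
    t = Decimal(1) / Decimal(3)         # result is rounded: inexact flag set

    # Copy the two flags of interest, clear them, then restore the earlier state.
    saved = {sig: ctx.flags[sig] for sig in (DivisionByZero, Inexact)}
    ctx.clear_flags()
    cleared = ctx.flags[DivisionByZero]
    for sig, was_set in saved.items():
        ctx.flags[sig] = was_set
    restored = ctx.flags[DivisionByZero]

print(q, bool(cleared), bool(restored))
```

The flags stay set until they are explicitly cleared, which is what makes it possible to run a long computation and inspect them once at the end.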
User

Any person, hardware, or program not itself specified by this standard, having access to and controlling those operations of the programming environment specified in this standard.

3. Formats

This standard defines four floating-point formats in two groups, basic and extended, each having two widths, single and double. The standard levels of implementation are distinguished by the combinations of formats supported.


3.1 Sets of values

This sub-clause concerns only the numerical values representable within a format, not the encodings which are the subject of the following sub-clauses. The only values representable in a chosen format are those specified via the following three integer parameters:

$p$ = number of significant bits (precision)
$E_{\max}$ = maximum exponent, and
$E_{\min}$ = minimum exponent

Each format's parameters are displayed in Table 1. Within each format just the following entities shall be provided:

Numbers of the form $(-1)^s \, 2^E \, (b_0 . b_1 b_2 \ldots b_{p-1})$, where:

$s$ is 0 or 1;
$E$ is any integer between $E_{\min}$ and $E_{\max}$ inclusive, and
each $b_i$ is 0 or 1.

Two infinities, $+\infty$ and $-\infty$;
at least one signalling NaN, and
at least one quiet NaN.

Table 1 - Summary of format parameters

Parameter                Single    Single Extended    Double    Double Extended
p                        24        ≥ 32               53        ≥ 64
E_max                    +127      ≥ +1023            +1023     ≥ +16383
E_min                    -126      ≤ -1022            -1022     ≤ -16382
Exponent bias            +127      Unspecified        +1023     Unspecified
Exponent width (bits)    8         ≥ 11               11        ≥ 15
Format width (bits)      32        ≥ 43               64        ≥ 79
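
The single and double columns of Table 1 are internally consistent, which a short sketch can confirm. The relations used here (bias $= 2^{w-1} - 1$ for exponent width $w$, $E_{\max}$ equal to the bias, $E_{\min} = 1 -$ bias, and format width $= 1 + w + (p - 1)$ with the leading significand bit left implicit) are observations that fit the tabulated basic-format values. They are assumptions of this example, not wording quoted from the standard, and the name `basic_format` is hypothetical.

```python
def basic_format(p: int, exponent_width: int) -> dict:
    """Derive the Table 1 parameters of a basic format from p and the exponent width."""
    bias = 2 ** (exponent_width - 1) - 1
    return {
        "p": p,
        "E_max": bias,
        "E_min": 1 - bias,
        "bias": bias,
        "exponent_width": exponent_width,
        # 1 sign bit + exponent field + fraction field of p - 1 bits
        "format_width": 1 + exponent_width + (p - 1),
    }

single = basic_format(p=24, exponent_width=8)
double = basic_format(p=53, exponent_width=11)
print(single)
print(double)
```

The extended formats are deliberately left out, since Table 1 gives only bounds for them and leaves their bias unspecified.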
The foregoing description enumerates some values redundantly, for example:

$2^0(1.0) = 2^1(0.1) = 2^2(0.01) = \ldots$

However, the encodings of such nonzero values may be redundant only in extended formats (see 3.3). The nonzero values of the form $\pm 2^{E_{\min}} (0 . b_1 b_2 \ldots b_{p-1})$ are called denormalized. Reserved exponents may be used to encode NaNs, $\pm\infty$, $\pm 0$, and denormalized numbers. For any variable that has the value zero, the sign bit $s$ provides an extra bit of information. Although all formats have distinct representations for $+0$ and $-0$, the signs are significant in some circumstances, such as division by zero, and not in others. In this standard, 0 and $\infty$ are written without a sign when the sign is not important.

3.2 Basic formats

Numbers in the single and double formats are composed of the following three fields:

a 1-bit sign $s$,
a biased exponent $e = E + \text{bias}$, and
a fraction $f = . b_1 b_2 \ldots b_{p-1}$.

The range of the unbiased exponent $E$ shall include every integer between the two values $E_{\min}$ and $E_{\max}$, inclusive, and also two other reserved values: $E_{\min} - 1$ to encode $\pm 0$ and denormalized numbers, and $E_{\max} + 1$ to encode $\pm\infty$ and NaNs. The foregoing parameters appear in Table 1. Each nonzero numerical value has just one encoding. The fields are interpreted as follows:
3.2.1 Simple pre
...
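
The role of the two reserved exponents described in 3.2 can be sketched as a small classifier for single-format bit patterns. The reserved biased values 0 (for $E_{\min} - 1$) and 255 (for $E_{\max} + 1$) follow from $e = E + \text{bias}$ with the Table 1 parameters. The bit positions of the fields, and the use of the fraction to tell zero from denormalized and infinity from NaN, are assumptions based on the common single-format encoding, because the sample breaks off before 3.2.1. `classify_single` is a hypothetical helper.

```python
def classify_single(bits: int) -> str:
    """Classify a 32-bit single-format pattern by its exponent and fraction fields."""
    e = (bits >> 23) & 0xFF    # biased exponent, 8 bits
    f = bits & 0x7FFFFF        # fraction, 23 bits
    if e == 0:                 # E = E_min - 1: zero or denormalized
        return "zero" if f == 0 else "denormalized"
    if e == 0xFF:              # E = E_max + 1: infinity or NaN
        return "infinity" if f == 0 else "NaN"
    return "normalized"

for pattern in (0x00000000, 0x80000000, 0x00000001,
                0x3F800000, 0x7F800000, 0x7FC00000):
    print(f"{pattern:#010x}: {classify_single(pattern)}")
```

The patterns 0x00000000 and 0x80000000 differ only in the sign bit and both classify as zero, which matches the statement that all formats have distinct representations for +0 and -0.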
