Welcome
Username or Email:

Password:


Missing Code




[ ]
[ ]
Online
  • Guests: 20
  • Members: 0
  • Newest Member: omjtest
  • Most ever online: 396
    Guests: 396, Members: 0 on 12 Jan : 12:51
Members Birthdays:
One birthday today, congrats!
Vaxian (17)


Next birthdays
05/21 Dalus (34)
05/21 Kizmo (37)
05/22 Skynet (32)
Contact
If you need assistance, please send an email to forum at 4hv dot org. To ensure your email is not marked as spam, please include the phrase "4hv help" in the subject line. You can also find assistance via IRC, at irc.shadowworld.net, room #hvcomm.
Support 4hv.org!
Donate:
4hv.org is hosted on a dedicated server. Unfortunately, this server costs and we rely on the help of site members to keep 4hv.org running. Please consider donating. We will place your name on the thanks list and you'll be helping to keep 4hv.org alive and free for everyone. Members whose names appear in red bold have donated recently. Green bold denotes those who have recently donated to keep the server carbon neutral.


Special Thanks To:
  • Aaron Holmes
  • Aaron Wheeler
  • Adam Horden
  • Alan Scrimgeour
  • Andre
  • Andrew Haynes
  • Anonymous000
  • asabase
  • Austin Weil
  • barney
  • Barry
  • Bert Hickman
  • Bill Kukowski
  • Blitzorn
  • Brandon Paradelas
  • Bruce Bowling
  • BubeeMike
  • Byong Park
  • Cesiumsponge
  • Chris F.
  • Chris Hooper
  • Corey Worthington
  • Derek Woodroffe
  • Dalus
  • Dan Strother
  • Daniel Davis
  • Daniel Uhrenholt
  • datasheetarchive
  • Dave Billington
  • Dave Marshall
  • David F.
  • Dennis Rogers
  • drelectrix
  • Dr. John Gudenas
  • Dr. Spark
  • E.TexasTesla
  • eastvoltresearch
  • Eirik Taylor
  • Erik Dyakov
  • Erlend^SE
  • Finn Hammer
  • Firebug24k
  • GalliumMan
  • Gary Peterson
  • George Slade
  • GhostNull
  • Gordon Mcknight
  • Graham Armitage
  • Grant
  • GreySoul
  • Henry H
  • IamSmooth
  • In memory of Leo Powning
  • Jacob Cash
  • James Howells
  • James Pawson
  • Jeff Greenfield
  • Jeff Thomas
  • Jesse Frost
  • Jim Mitchell
  • jlr134
  • Joe Mastroianni
  • John Forcina
  • John Oberg
  • John Willcutt
  • Jon Newcomb
  • klugesmith
  • Leslie Wright
  • Lutz Hoffman
  • Mads Barnkob
  • Martin King
  • Mats Karlsson
  • Matt Gibson
  • Matthew Guidry
  • mbd
  • Michael D'Angelo
  • Mikkel
  • mileswaldron
  • mister_rf
  • Neil Foster
  • Nick de Smith
  • Nick Soroka
  • nicklenorp
  • Nik
  • Norman Stanley
  • Patrick Coleman
  • Paul Brodie
  • Paul Jordan
  • Paul Montgomery
  • Ped
  • Peter Krogen
  • Peter Terren
  • PhilGood
  • Richard Feldman
  • Robert Bush
  • Royce Bailey
  • Scott Fusare
  • Scott Newman
  • smiffy
  • Stella
  • Steven Busic
  • Steve Conner
  • Steve Jones
  • Steve Ward
  • Sulaiman
  • Thomas Coyle
  • Thomas A. Wallace
  • Thomas W
  • Timo
  • Torch
  • Ulf Jonsson
  • vasil
  • Vaxian
  • vladi mazzilli
  • wastehl
  • Weston
  • William Kim
  • William N.
  • William Stehl
  • Wesley Venis
The aforementioned have contributed financially to the continuing triumph of 4hv.org. They are deserving of my most heartfelt thanks.
Forums
4hv.org :: Forums :: Computer Science
« Previous topic | Next topic »   

Unsigned integer division by 10

1 2 
Move Thread LAN_403
Bjørn
Sat Dec 30 2006, 12:09PM Print
Bjørn Registered Member #27 Joined: Fri Feb 03 2006, 02:20AM
Location: Hyperborea
Posts: 2058
;Works for numbers 0-16388
;Also works with "nice" numbers like 100000
add 	r1,r0,r0,lsl #1
add 	r0,r0,r1,lsl #2
add 	r0,r0,r1,lsl #6
add 	r0,r0,r1,lsl #10
mov	 r0,r0,lsr #15

It uses the formula n=(n*3277)/32768
Since the division by 32768 usuallly is free, it is normally four clock cycles. Does anyone know a very fast way that works with any 32 bit number?
Back to top
Carbon_Rod
Sun Dec 31 2006, 05:12AM
Carbon_Rod Registered Member #65 Joined: Thu Feb 09 2006, 06:43AM
Location:
Posts: 1155
I am not sure what your core register set .

However, most 32 bit cpu will support hardware division and register swapping. (or in C one can attempt to implement as "register" level variable and hope the compiler is smart enough to do it for you.)

Most architectures I have seen do a few things:
1.) support Divide by zero error detection.
2.) Store the modulus remainder.
3.) May have separate signed or unsigned Mul/Div operators.

There is a similar 4 decimal accurate root function that is done in a manner to what you have described. Most vendors will have free asm libs that do such tasks for you.


Back to top
Bjørn
Sun Dec 31 2006, 05:47AM
Bjørn Registered Member #27 Joined: Fri Feb 03 2006, 02:20AM
Location: Hyperborea
Posts: 2058
This is the instruction set, except the Vector instructions.
http://wiki.4hv.org/index.php/Instruction_set:_ARM

Multiply with 3277 takes 3 cycles and loading 3277 as a constant takes 2 cycles so it will take one cycle more to use the multiply instruction.

The Keil CARM C compiler uses more than 100 cycles to divide by 10. I would like a general divide by 10 in less than 20 cycles.
Back to top
Carbon_Rod
Sun Dec 31 2006, 10:11AM
Carbon_Rod Registered Member #65 Joined: Thu Feb 09 2006, 06:43AM
Location:
Posts: 1155
A faster clock may be needed if you are that fixed for time efficient code. ;D
(or a space vs. time trade off.)

Check out this info etc. The simulation tool may be helpful. (lecture5 also talks about the ARM div limits.)
Link2

Personally, I prefer to compile to intermediate asm then analyze and optimize the routines there to export back into C as a block of ASM code. Of course the asm lib objects can be linked in with relative ease on most systems too. But register context preservation on any interruptible stack can be tricky to debug (expect to see a rise of volatile and static variable types.)

The ARM GCC is not that bad cheesey

Is this the 32bit by 16bit code you were referring to?
AVR code ( Link2 )


;divide r21:r18/refH:refL result in XH:XL
__DIVD21U:
CLR XL
CLR XH
CLR R0
CLR R1
LDI R17,32
__DIVD21U1:
LSL R18
ROL R19
ROL R20
ROL R21
ROL XL
ROL XH
ROL R0
ROL R1
SUB XL,scaleL
SBC XH,scaleH
SBC R0,zeroreg
SBC R1,zeroreg
BRCC __DIVD21U2
ADD XL,scaleL
ADC XH,scaleH
ADC R0,zeroreg
ADC R1,zeroreg
RJMP __DIVD21U3
__DIVD21U2:
SBR R18,1
__DIVD21U3:
DEC R17
BRNE __DIVD21U1
mov XL, r19
mov XH, r18 ; result in X

ret


Back to top
Bjørn
Sun Dec 31 2006, 01:53PM
Bjørn Registered Member #27 Joined: Fri Feb 03 2006, 02:20AM
Location: Hyperborea
Posts: 2058
The Keil C compiler uses an algorithm that looks very similar to that, it is short but it still is 100+ cycles even on small numbers.

udiv32	  MOV     R3,#0
			MOVS    R2,R1
			moveq	pc,lr			;Divide by 0
udiv32l1	CMP     R2,R0,LSR #1
			MOVLS   R2,R2,LSL #1
			BCC     udiv32l1
udiv32l2	CMP     R0,R2
			ADC     R3,R3,R3
			SUBCS   R0,R0,R2
			TEQ     R2,R1
			MOVNE   R2,R2,LSR #1
			BNE     udiv32l2
			MOV	 PC,LR


I have a different algorithm that is more than twice as fast, but it still is not as fast as I would like for division by 10.
Back to top
Steve Conner
Sun Dec 31 2006, 02:17PM
Steve Conner Registered Member #30 Joined: Fri Feb 03 2006, 10:52AM
Location: Glasgow, Scotland
Posts: 6706
If you have a hardware multiplier, you should be able to divide X by Y in about as many cycles as Y has bits. For example, the DSP I use divides a 32-bit numerator by a 16-bit denominator in 16 cycles. I think it's basically a successive approximation where it has 16 guesses at the answer, and each time multiplies the guess by the denominator and compares it to the numerator, and sets or clears the appropriate bit in the guess, depending on whether it was too high or too low.

Division is still 16 times more expensive than multiplication, so if I want to divide an array of numbers by a constant, I use the division algorithm to find 1/constant and then multiply each number in the array by that.
Back to top
Bjørn
Sun Dec 31 2006, 03:10PM
Bjørn Registered Member #27 Joined: Fri Feb 03 2006, 02:20AM
Location: Hyperborea
Posts: 2058
My first thought is that if X=0xFFFFFFFF and Y=0x1 then the answer is 0xFFFFFFFF. That means setting 32 bits in 16 cycles. I am not sure how that would work with successive approximation.
Back to top
Steve Conner
Mon Jan 01 2007, 04:35PM
Steve Conner Registered Member #30 Joined: Fri Feb 03 2006, 10:52AM
Location: Glasgow, Scotland
Posts: 6706
Well, it only gives a 16-bit result. If the result is too big to fit in 16 bits then it will be wrong.
Back to top
Bjørn
Mon Jan 01 2007, 04:42PM
Bjørn Registered Member #27 Joined: Fri Feb 03 2006, 02:20AM
Location: Hyperborea
Posts: 2058
Then it makes sense.

Set, multiply, compare, clear, that would be 6 cycles for each bit on my chip so 96 cycles in total.
Back to top
Steve Conner
Tue Jan 02 2007, 12:17AM
Steve Conner Registered Member #30 Joined: Fri Feb 03 2006, 10:52AM
Location: Glasgow, Scotland
Posts: 6706
I forgot to say, there's a fundamental distinction between dividing a variable by a constant and dividing a variable by another variable. The difference being that you can precompute the inverse of a constant and use multiplication at runtime, making it far faster.

Bjoern, when you asked for a method that works with any number, I and the other posters assumed you wanted to divide a variable by a variable. OTOH, if you want to just divide a variable by other constants than 10, I guess you can use the same method you're using just now. If I understand right, you precomputed your approximation to 1/10 as 3277/32768, and then multiplied the variable by 3277 using a series of shifts and adds, and finally divided it by 32768 with a shift right.

You can extend the method to any constant. For instance, 1/100 is 328/32768. It'll take about as many cycles as there are 1's in the binary representation of the quantity that appears as 3277 above.
Back to top
1 2 

Moderator(s): Chris Russell, Noelle, Alex, Tesladownunder, Dave Marshall, Dave Billington, Bjørn, Steve Conner, Wolfram, Kizmo, Mads Barnkob

Go to:

Powered by e107 Forum System
 
Legal Information
This site is powered by e107, which is released under the GNU GPL License. All work on this site, except where otherwise noted, is licensed under a Creative Commons Attribution-ShareAlike 2.5 License. By submitting any information to this site, you agree that anything submitted will be so licensed. Please read our Disclaimer and Policies page for information on your rights and responsibilities regarding this site.