If you need assistance, please send an email to forum at 4hv dot org. To ensure your email is not marked as spam, please include the phrase "4hv help" in the subject line. You can also find assistance via IRC, at irc.shadowworld.net, room #hvcomm.
Support 4hv.org!
Donate:
4hv.org is hosted on a dedicated server. Unfortunately, this server costs and we rely on the help of site members to keep 4hv.org running. Please consider donating. We will place your name on the thanks list and you'll be helping to keep 4hv.org alive and free for everyone. Members whose names appear in red bold have donated recently. Green bold denotes those who have recently donated to keep the server carbon neutral.
Special Thanks To:
Aaron Holmes
Aaron Wheeler
Adam Horden
Alan Scrimgeour
Andre
Andrew Haynes
Anonymous000
asabase
Austin Weil
barney
Barry
Bert Hickman
Bill Kukowski
Blitzorn
Brandon Paradelas
Bruce Bowling
BubeeMike
Byong Park
Cesiumsponge
Chris F.
Chris Hooper
Corey Worthington
Derek Woodroffe
Dalus
Dan Strother
Daniel Davis
Daniel Uhrenholt
datasheetarchive
Dave Billington
Dave Marshall
David F.
Dennis Rogers
drelectrix
Dr. John Gudenas
Dr. Spark
E.TexasTesla
eastvoltresearch
Eirik Taylor
Erik Dyakov
Erlend^SE
Finn Hammer
Firebug24k
GalliumMan
Gary Peterson
George Slade
GhostNull
Gordon Mcknight
Graham Armitage
Grant
GreySoul
Henry H
IamSmooth
In memory of Leo Powning
Jacob Cash
James Howells
James Pawson
Jeff Greenfield
Jeff Thomas
Jesse Frost
Jim Mitchell
jlr134
Joe Mastroianni
John Forcina
John Oberg
John Willcutt
Jon Newcomb
klugesmith
Leslie Wright
Lutz Hoffman
Mads Barnkob
Martin King
Mats Karlsson
Matt Gibson
Matthew Guidry
mbd
Michael D'Angelo
Mikkel
mileswaldron
mister_rf
Neil Foster
Nick de Smith
Nick Soroka
nicklenorp
Nik
Norman Stanley
Patrick Coleman
Paul Brodie
Paul Jordan
Paul Montgomery
Ped
Peter Krogen
Peter Terren
PhilGood
Richard Feldman
Robert Bush
Royce Bailey
Scott Fusare
Scott Newman
smiffy
Stella
Steven Busic
Steve Conner
Steve Jones
Steve Ward
Sulaiman
Thomas Coyle
Thomas A. Wallace
Thomas W
Timo
Torch
Ulf Jonsson
vasil
Vaxian
vladi mazzilli
wastehl
Weston
William Kim
William N.
William Stehl
Wesley Venis
The aforementioned have contributed financially to the continuing triumph of 4hv.org. They are deserving of my most heartfelt thanks.
Registered Member #27
Joined: Fri Feb 03 2006, 02:20AM
Location: Hyperborea
Posts: 2058
;Works for numbers 0-16388
;Also works with "nice" numbers like 100000
add r1,r0,r0,lsl #1
add r0,r0,r1,lsl #2
add r0,r0,r1,lsl #6
add r0,r0,r1,lsl #10
mov r0,r0,lsr #15
It uses the formula n=(n*3277)/32768 Since the division by 32768 usuallly is free, it is normally four clock cycles. Does anyone know a very fast way that works with any 32 bit number?
Registered Member #65
Joined: Thu Feb 09 2006, 06:43AM
Location:
Posts: 1155
I am not sure what your core register set .
However, most 32 bit cpu will support hardware division and register swapping. (or in C one can attempt to implement as "register" level variable and hope the compiler is smart enough to do it for you.)
Most architectures I have seen do a few things: 1.) support Divide by zero error detection. 2.) Store the modulus remainder. 3.) May have separate signed or unsigned Mul/Div operators.
There is a similar 4 decimal accurate root function that is done in a manner to what you have described. Most vendors will have free asm libs that do such tasks for you.
Registered Member #65
Joined: Thu Feb 09 2006, 06:43AM
Location:
Posts: 1155
A faster clock may be needed if you are that fixed for time efficient code. ;D (or a space vs. time trade off.)
Check out this info etc. The simulation tool may be helpful. (lecture5 also talks about the ARM div limits.)
Personally, I prefer to compile to intermediate asm then analyze and optimize the routines there to export back into C as a block of ASM code. Of course the asm lib objects can be linked in with relative ease on most systems too. But register context preservation on any interruptible stack can be tricky to debug (expect to see a rise of volatile and static variable types.)
The ARM GCC is not that bad
Is this the 32bit by 16bit code you were referring to? AVR code ( )
;divide r21:r18/refH:refL result in XH:XL __DIVD21U: CLR XL CLR XH CLR R0 CLR R1 LDI R17,32 __DIVD21U1: LSL R18 ROL R19 ROL R20 ROL R21 ROL XL ROL XH ROL R0 ROL R1 SUB XL,scaleL SBC XH,scaleH SBC R0,zeroreg SBC R1,zeroreg BRCC __DIVD21U2 ADD XL,scaleL ADC XH,scaleH ADC R0,zeroreg ADC R1,zeroreg RJMP __DIVD21U3 __DIVD21U2: SBR R18,1 __DIVD21U3: DEC R17 BRNE __DIVD21U1 mov XL, r19 mov XH, r18 ; result in X
Registered Member #30
Joined: Fri Feb 03 2006, 10:52AM
Location: Glasgow, Scotland
Posts: 6706
If you have a hardware multiplier, you should be able to divide X by Y in about as many cycles as Y has bits. For example, the DSP I use divides a 32-bit numerator by a 16-bit denominator in 16 cycles. I think it's basically a successive approximation where it has 16 guesses at the answer, and each time multiplies the guess by the denominator and compares it to the numerator, and sets or clears the appropriate bit in the guess, depending on whether it was too high or too low.
Division is still 16 times more expensive than multiplication, so if I want to divide an array of numbers by a constant, I use the division algorithm to find 1/constant and then multiply each number in the array by that.
Registered Member #27
Joined: Fri Feb 03 2006, 02:20AM
Location: Hyperborea
Posts: 2058
My first thought is that if X=0xFFFFFFFF and Y=0x1 then the answer is 0xFFFFFFFF. That means setting 32 bits in 16 cycles. I am not sure how that would work with successive approximation.
Registered Member #30
Joined: Fri Feb 03 2006, 10:52AM
Location: Glasgow, Scotland
Posts: 6706
I forgot to say, there's a fundamental distinction between dividing a variable by a constant and dividing a variable by another variable. The difference being that you can precompute the inverse of a constant and use multiplication at runtime, making it far faster.
Bjoern, when you asked for a method that works with any number, I and the other posters assumed you wanted to divide a variable by a variable. OTOH, if you want to just divide a variable by other constants than 10, I guess you can use the same method you're using just now. If I understand right, you precomputed your approximation to 1/10 as 3277/32768, and then multiplied the variable by 3277 using a series of shifts and adds, and finally divided it by 32768 with a shift right.
You can extend the method to any constant. For instance, 1/100 is 328/32768. It'll take about as many cycles as there are 1's in the binary representation of the quantity that appears as 3277 above.
This site is powered by e107, which is released under the GNU GPL License. All work on this site, except where otherwise noted, is licensed under a Creative Commons Attribution-ShareAlike 2.5 License. By submitting any information to this site, you agree that anything submitted will be so licensed. Please read our Disclaimer and Policies page for information on your rights and responsibilities regarding this site.