If you need assistance, please send an email to forum at 4hv dot org. To ensure your email is not marked as spam, please include the phrase "4hv help" in the subject line. You can also find assistance via IRC, at irc.shadowworld.net, room #hvcomm.
Support 4hv.org!
Donate:
4hv.org is hosted on a dedicated server. Unfortunately, this server costs and we rely on the help of site members to keep 4hv.org running. Please consider donating. We will place your name on the thanks list and you'll be helping to keep 4hv.org alive and free for everyone. Members whose names appear in red bold have donated recently. Green bold denotes those who have recently donated to keep the server carbon neutral.
Special Thanks To:
Aaron Holmes
Aaron Wheeler
Adam Horden
Alan Scrimgeour
Andre
Andrew Haynes
Anonymous000
asabase
Austin Weil
barney
Barry
Bert Hickman
Bill Kukowski
Blitzorn
Brandon Paradelas
Bruce Bowling
BubeeMike
Byong Park
Cesiumsponge
Chris F.
Chris Hooper
Corey Worthington
Derek Woodroffe
Dalus
Dan Strother
Daniel Davis
Daniel Uhrenholt
datasheetarchive
Dave Billington
Dave Marshall
David F.
Dennis Rogers
drelectrix
Dr. John Gudenas
Dr. Spark
E.TexasTesla
eastvoltresearch
Eirik Taylor
Erik Dyakov
Erlend^SE
Finn Hammer
Firebug24k
GalliumMan
Gary Peterson
George Slade
GhostNull
Gordon Mcknight
Graham Armitage
Grant
GreySoul
Henry H
IamSmooth
In memory of Leo Powning
Jacob Cash
James Howells
James Pawson
Jeff Greenfield
Jeff Thomas
Jesse Frost
Jim Mitchell
jlr134
Joe Mastroianni
John Forcina
John Oberg
John Willcutt
Jon Newcomb
klugesmith
Leslie Wright
Lutz Hoffman
Mads Barnkob
Martin King
Mats Karlsson
Matt Gibson
Matthew Guidry
mbd
Michael D'Angelo
Mikkel
mileswaldron
mister_rf
Neil Foster
Nick de Smith
Nick Soroka
nicklenorp
Nik
Norman Stanley
Patrick Coleman
Paul Brodie
Paul Jordan
Paul Montgomery
Ped
Peter Krogen
Peter Terren
PhilGood
Richard Feldman
Robert Bush
Royce Bailey
Scott Fusare
Scott Newman
smiffy
Stella
Steven Busic
Steve Conner
Steve Jones
Steve Ward
Sulaiman
Thomas Coyle
Thomas A. Wallace
Thomas W
Timo
Torch
Ulf Jonsson
vasil
Vaxian
vladi mazzilli
wastehl
Weston
William Kim
William N.
William Stehl
Wesley Venis
The aforementioned have contributed financially to the continuing triumph of 4hv.org. They are deserving of my most heartfelt thanks.
Registered Member #27
Joined: Fri Feb 03 2006, 02:20AM
Location: Hyperborea
Posts: 2058
Does anyone know any efficient square root algorithms? I made this one in an attempt to make it very short and failsafe but it is the fastest one I have tried out so far.
I was surprised not to find anything faster so if anyone has any ideas or pointers towards something efficient it would be interesting to hear. I saw an algorithm years ago that was very efficient but I can't recall the name.
Registered Member #52
Joined: Thu Feb 09 2006, 04:22AM
Location: Austin TX
Posts: 57
The code you've shown is equivalent to the following C code:
int const isqrt( int const square )
{
int guess = 0;
int bit = 1 << 15; // do the first 15 LSBs
while ( bit > 0 )
{
guess |= bit; // build up a value
if ( guess * guess > square ) // is the square of the guess bigger than the orginal?
guess &= ~bit; // remove the bit
bit >>= 1;
}
return guess;
}
It doesn't scale very well beyond 32 bit inputs and will always take 96 instructions to execute, but the algorithm you choose is dependant on your situation. If you're ever going to be taking the square root of 8 bit numbers you would use a different algorithm from if you were getting the square root of an 80 bit floating point number.
Normally table lookup with a Newton iteration like Newton-Raphson yeilds fast and accurate results for larger numbers. A purely iterative approach works really well for small numbers. It's always a trade off of overhead
I have used the BSR trick on Intel that is pointed out in the link ShawnLG provided to get an approximation before doing an iterative approach, as well.
Registered Member #27
Joined: Fri Feb 03 2006, 02:20AM
Location: Hyperborea
Posts: 2058
My code does exactly the same as the following C code except that it exits early if it finds an exact result.
Table lookup in Flash memory is quite slow and the accuracy is not what I was aiming for but I have made a note of the <20 cycle algorithm for other uses.
I tried the sum of odds and i was faster on 8 bit numbers than my first attempt but if I configure my code for 8 bit then it becomes a lot faster than the sum of odds.
Registered Member #32
Joined: Sat Feb 04 2006, 08:58AM
Location: Australia
Posts: 549
The Newton method should work well. To keep the options coming, what about a Taylor expansion? Ie. approximating the square-root function with a polynomial. Each term in your approximation should only involve two extra multiplications at most.
This isn't the best applicaton for Taylor approximations, however. Exponents and trig are better examples.
I guess this is meant to be a general algorithm, otherwise that would change things a lot.
Registered Member #27
Joined: Fri Feb 03 2006, 02:20AM
Location: Hyperborea
Posts: 2058
It is supposed to be a general purpose algorithm for a microcomputer and have these properties that does not mix well:
Predictable
Accurate
Fast
Small
My first attempt unrolls well and I can use an unrolled version for 16 bit numbers and use that for small numbers and to approximate the square root of large numbers and use the looped version for 32 bit numbers.
Registered Member #65
Joined: Thu Feb 09 2006, 06:43AM
Location:
Posts: 1155
For low speed 8-bit real time fixed-root core design a look-up style hash table is nice as it uses no ram. =) Its cost is O (1) so it’s fast, simple, and predictable. There is a space-time trade off.. However, it also offers controlled number rounding if a weighted result is needed.
Did you mean an n-th root function or specifically square root? Estimated results using successive approximation techniques are common.
Registered Member #27
Joined: Fri Feb 03 2006, 02:20AM
Location: Hyperborea
Posts: 2058
I was specifically thinking about square roots since that is what I needed at the time. I would be interested in efficient implementations of all kinds of useful functions, including IEEE 754 floating point functions. There is only room for 8192 instructions and there is a lot of other things to be included.
This site is powered by e107, which is released under the GNU GPL License. All work on this site, except where otherwise noted, is licensed under a Creative Commons Attribution-ShareAlike 2.5 License. By submitting any information to this site, you agree that anything submitted will be so licensed. Please read our Disclaimer and Policies page for information on your rights and responsibilities regarding this site.