4hv.org :: Forums :: General Science and Electronics

More 3D reconstruction Q's

Arkin
Wed Mar 17 2010, 01:16AM
Arkin Registered Member #2140 Joined: Tue May 26 2009, 09:16PM
Location:
Posts: 53
For those of you who aren't aware, I am working on a project for 3D reconstruction. The main info and early progress are here: Link2

This is the approximate current result of the disparity map: Link2

Recently, some people at blenderartists wrote me a script that imports points from a text file into Blender as a point cloud. Right now no real distances are calculated; the values are just taken directly from the displacement value. Link2

These are my questions:

1) In images, farther objects appear smaller than they are in real life. Should I compensate for this by using the distance value to calculate the real size? What would that calculation look like? Where would the real-sized object be in terms of (x, y)? Should it be scaled about the same centre?
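For what it's worth, under a simple pinhole-camera model (an assumption on my part, not something stated in the thread), apparent size scales as 1/Z, so the compensation asked about in question 1 might be sketched like this. The focal length f and image centre (cx, cy) are hypothetical calibration values:

```python
# Sketch of perspective compensation under an assumed pinhole model.

def pixel_to_world(u, v, Z, f, cx, cy):
    """Back-project pixel (u, v) at depth Z to world coordinates (X, Y, Z).

    f is the focal length in pixels; (cx, cy) is the image centre.
    An object w pixels wide at depth Z has real width about w * Z / f,
    and the scaling happens about the centre (cx, cy), not the corner.
    """
    X = (u - cx) * Z / f
    Y = (v - cy) * Z / f
    return X, Y, Z
```

So a pixel 100 columns right of centre, at depth 2 m with f = 500 px, sits 0.4 m off the optical axis.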

2) Right now, I am just using the left image and building the disparity map on top of it, so objects and map elements align. Should I instead use the midpoint of the two detected instances in each image? For example:
l. image: ---x----------
r. image: -----------x--

so the calculated displacement is the absolute difference, or 7 units. Normally, the disparity map entry would sit at the left image location. Should I instead displace it to the right by 3.5 units (7 divided by 2)?

3) How would I go about calculating "density", so that small patches or stray pixels are filtered out? Should I simply enforce a minimum area with a flood fill?
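One way to read question 3: treat the pixels that have a valid disparity as a binary mask and drop connected regions below a minimum area with a flood fill. A minimal sketch in plain Python, assuming 4-connectivity:

```python
from collections import deque

def filter_small_patches(mask, min_area):
    """Remove connected regions of True cells smaller than min_area.

    mask: 2D list of bools (pixels with a valid disparity).
    Returns a new mask. A minimal flood-fill sketch of the idea,
    not an optimised implementation.
    """
    h, w = len(mask), len(mask[0])
    out = [[False] * w for _ in range(h)]
    seen = [[False] * w for _ in range(h)]
    for sy in range(h):
        for sx in range(w):
            if mask[sy][sx] and not seen[sy][sx]:
                # Flood-fill one 4-connected component.
                comp, q = [], deque([(sy, sx)])
                seen[sy][sx] = True
                while q:
                    y, x = q.popleft()
                    comp.append((y, x))
                    for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            q.append((ny, nx))
                if len(comp) >= min_area:   # keep only large-enough patches
                    for y, x in comp:
                        out[y][x] = True
    return out
```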

I also had another thought on matching the two images (image correspondence). Right now, the Sum of Absolute Differences works to an extent, but it has a major limitation on the size of the objects it can detect. If you go smaller, major errors become apparent.
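For reference, the baseline SAD matcher being described can be sketched as follows; rectified grayscale images (so matches lie on the same scanline) are assumed:

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equal-sized 2D blocks."""
    return sum(abs(a - b) for ra, rb in zip(block_a, block_b)
               for a, b in zip(ra, rb))

def best_disparity(left, right, x, y, size, max_disp):
    """Find the horizontal shift minimising SAD for the block at (x, y).

    left/right are 2D lists of grayscale values, assumed rectified.
    A minimal sketch of the baseline matcher, one block at a time.
    """
    ref = [row[x:x + size] for row in left[y:y + size]]
    best, best_cost = 0, float("inf")
    for d in range(min(max_disp, x) + 1):   # candidate shifts to the left
        cand = [row[x - d:x - d + size] for row in right[y:y + size]]
        cost = sad(ref, cand)
        if cost < best_cost:
            best, best_cost = d, cost
    return best
```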

Instead, what if it grabbed pixel (x, y) and linked it to reference points, say (x+3, y-3) among others? When searching for that "network", it would have to match a certain number of points within a certain threshold.

The reasons I am hesitant to try this are:
1) It's complicated. What happens if the reference points don't exist (they fall off the image)? Should I have "backup references"? Randomly indexed references?

2) Speed. Having so many networks creates a huge amount of data. For example, a 640x480 picture is 307,200 pixels. With only 5 reference pixels each, which I think is low, that becomes 1,536,000 points. Beyond the memory cost, the computer has to search all of these points, search neighboring pixels within the threshold, and so on. Raise the threshold by one point and you add multiple search areas.

3) Color at the pixel level is not very reliable. It may work with enough reference points, but then you run into problem 2. Instead of color, you could match a point's links' links, for however many levels you want to follow the recursive tree. That would also be computationally intensive, at an exponential rate, and it would be hard for me to code that much recursion.

Disregarding speed, do you think using a network of reference points would work?
Regarding speed, do you think this kind of computation is suitable for GPU computing (CUDA or OpenCL)?

Sorry for the big wall of text; I am trying to be descriptive.

EDIT: Do you think having a network of computers do the calculation would help, or would network (LAN) speed just limit it? They all share a public drive, so there is no need to transmit images, only which parts to match up, with the coordinates sent back to a master.
Bjørn
Wed Mar 17 2010, 03:15AM
Bjørn Registered Member #27 Joined: Fri Feb 03 2006, 02:20AM
Location: Hyperborea
Posts: 2058
This is a lot of stuff...

If you want to render your data, you need to take into account the distance from the camera; Wikipedia will tell you all about 3D transformations in a completely unreadable way. You have a camera, so why not take a few pictures at different distances and work out the simple formula yourself?
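The "work out the simple formula" suggestion can be made concrete. Under the usual pinhole stereo model (my assumption, not stated in the thread), depth is inversely proportional to disparity, Z = f*B/d, so a single constant k = f*B can be fitted by least squares from a few measured (disparity, distance) pairs:

```python
def fit_inverse_model(samples):
    """Fit Z = k / d to measured (disparity, distance) pairs.

    Under the pinhole stereo model, Z = f * B / d, with f the focal
    length in pixels and B the camera baseline, so one constant
    k = f * B suffices. Closed-form least squares of Z against 1/d:
    k = sum(Z_i / d_i) / sum(1 / d_i^2).
    """
    num = sum(z / d for d, z in samples)
    den = sum(1.0 / (d * d) for d, z in samples)
    return num / den

def disparity_to_depth(d, k):
    """Convert a disparity value to depth using the calibrated constant."""
    return k / d
```

A handful of tape-measure readings at known distances is enough to pin down k.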

You can speed up the calculations 1000 times by improving your algorithm, and I doubt you can get 1000 fast PCs, so think about distribution after everything is working fine. The same goes for CUDA: I have done some work with it, and the speed-up is nowhere close to the claims when you compare fairly (optimized program vs. optimized program). You will get a faster program that only works on some PCs, and that is slower than an optimized original on some of the ones it does work on.

You need to think more about your requirements, because you are wasting time by not focusing your work on the most important areas. For example, the distance between your two cameras affects your accuracy significantly: move the cameras closer and everything becomes much simpler. It is realistic to get good subpixel resolution on objects that are not very small, so working on that might be a much more efficient way to spend your time. The same goes for variable distance, unless you have some secret requirement for a fixed distance.
Carbon_Rod
Wed Mar 17 2010, 09:04AM
Carbon_Rod Registered Member #65 Joined: Thu Feb 09 2006, 06:43AM
Location:
Posts: 1155
Great tool for raw data:
Link2

A 3D scanner built from a camera and a computer projector may work better.
Google: structured light scanner

Cheers,
Arkin
Wed Mar 17 2010, 11:15PM
Arkin Registered Member #2140 Joined: Tue May 26 2009, 09:16PM
Location:
Posts: 53
I have a hard time believing that CUDA wouldn't make much of a difference. Some applications don't gain much, but this is extremely repetitive and recursive. Instead of matching one block at a time, you could match more than 100 at a time. The main speed limit will be transferring the data to video RAM.

Also, how would having the cameras closer make it more accurate? I purposely set them far apart to get a recognizable change.

I do have access to 60+ computers at my school, each dual core. But yes, I would rather avoid that complexity (networking, job management, etc.).
Bjørn
Thu Mar 18 2010, 01:50AM
Bjørn Registered Member #27 Joined: Fri Feb 03 2006, 02:20AM
Location: Hyperborea
Posts: 2058
CUDA is not like having hundreds of separate processors; it is several arrays of ALUs where all units in an array stall unless they can all follow the same code path. The largest speed-up I have gotten is 4x, comparing CUDA to a pretty optimal program running on all cores of my PC. A comparison made by NVIDIA would have run it unoptimized on one core and claimed a 50x speed-up.

This searching will run particularly well on a GPU, since each search operation can be done completely independently. On the other hand, a CPU can do it more efficiently because it can filter out blocks in ways a GPU can't: the GPU must treat each block the same to avoid stalling a large group of ALUs. If you do it right, almost all the data will come out of the CPU's cache, so the GPU can't exploit its advantage of huge memory bandwidth. The CPU will therefore gain back some of the GPU's advantage, and the speed-up will not be able to compete with the improvements a better algorithm will bring.


Moving the cameras closer reduces the apparent rotation of objects that happens when you view them from a different angle, which makes it easier to find an exact match for your block. It also greatly reduces the number of pixels that are visible in only one picture. This increased signal-to-noise ratio can be used to make your program faster and more accurate.
Arkin
Thu Mar 18 2010, 12:12PM
Arkin Registered Member #2140 Joined: Tue May 26 2009, 09:16PM
Location:
Posts: 53
Right now, each block is being treated independently, just on one core. Every block goes through the same code.

What ways of filtering are you talking about? I was simply going to calculate the disparities, then go through the resulting array and filter it out.

I've been experimenting with closer cameras, and it works well, but I cannot go to larger search block sizes (they overlap then). This results in a noisier final product, with lots of disparities "mixed in".

I am thinking it now comes down to filtering, since the SAD algorithm itself is fairly simple. I also plan to try the network of reference pixels.

Also, I was going to try using edge detection. With edge detection, I can make several bounding areas, which are essentially the bounds of an object. If a bounded area is less than x (so it's not something big like the floor), then all of the disparities inside it will be set the same.

The problem I have been having with that is the pictures aren't perfect, so I am getting gaps in the lines. What's a good way to "guess" where missing line segments go, based on the surroundings?
Bjørn
Thu Mar 18 2010, 02:49PM
Bjørn Registered Member #27 Joined: Fri Feb 03 2006, 02:20AM
Location: Hyperborea
Posts: 2058
I mean that you can save a lot of time by filtering out blocks that are not worth processing. For example, you can check a single pixel and see that it is entirely the wrong colour, or that the contrast is so low that any result would be meaningless. As another example, there are several fast methods for finding the brightness of a block, and you can skip comparisons for blocks whose brightness differs too much. The list of fast ways to filter out hopeless blocks can be quite long. Searching in the frequency domain is also faster.
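A sketch of this quick-rejection idea; the thresholds are illustrative assumptions, not values from the thread:

```python
def block_stats(block):
    """Mean and peak-to-peak contrast of a 2D block of grayscale values."""
    flat = [p for row in block for p in row]
    return sum(flat) / len(flat), max(flat) - min(flat)

def worth_comparing(ref_block, cand_block, min_contrast=8, max_mean_diff=20):
    """Cheap rejection tests to run before a full SAD comparison.

    Skip featureless blocks (any match would be meaningless) and
    candidate blocks whose mean brightness is too far from the
    reference to ever match well. Threshold values are made up.
    """
    ref_mean, ref_contrast = block_stats(ref_block)
    if ref_contrast < min_contrast:
        return False                    # low contrast: not worth matching
    cand_mean, _ = block_stats(cand_block)
    return abs(ref_mean - cand_mean) <= max_mean_diff
```

Only the pairs that survive these tests need the expensive per-pixel comparison.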

SAD is not the most accurate way to compare blocks; see for example: Link2. I suggest you try a video compressor, enable its motion compensation debugging output, and see how well it does on your pictures. Many possibilities exist; for speed, a hybrid that chooses a method depending on the nature of the block might be the most efficient.

Object detection is very difficult except in trivial cases.
Arkin
Fri Mar 19 2010, 04:54PM
Arkin Registered Member #2140 Joined: Tue May 26 2009, 09:16PM
Location:
Posts: 53
I will be trying more than one method, but I want to complete each one (this is, hopefully, for a Siemens fair or the Intel talent search).

I started rewriting the program, and now, before finding matches, it filters out bad blocks based on brightness and contrast. This did make everything faster (well, actually slower, but faster with the GUI components disabled). I also managed to cut the memory usage by more than half.

I didn't mean object detection exactly, but "boundaries". For example, if it is scanning left to right, it continues until it reaches a boundary (an edge). It would then average all the blocks before it, set all of those blocks to the average, and continue. This way, the edges become lines separating different disparities.

Obviously, slanted planes would cause trouble, but a combination may work. I am hoping to try this, even if there is a low chance of success.
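On a single scanline, the edge-bounded averaging described above might look like this (a sketch of one reading of the scheme, not a tested method):

```python
def smooth_between_edges(disparities, edges):
    """Average each run of disparities between edge pixels on one scanline.

    disparities: list of disparity values for one row.
    edges: list of bools marking edge pixels on that row.
    Each maximal run of non-edge pixels is replaced by its mean, so
    edges become the boundaries between flat disparity regions.
    """
    out = list(disparities)
    start = 0
    for i in range(len(disparities) + 1):
        at_break = i == len(disparities) or edges[i]
        if at_break:
            if i > start:               # average the run [start, i)
                avg = sum(disparities[start:i]) / (i - start)
                for j in range(start, i):
                    out[j] = avg
            start = i + 1
    return out
```

Applying this per row (and perhaps per column as well) would flatten each edge-bounded region, at the cost of the slanted-plane problem mentioned.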
Bjørn
Sat Mar 20 2010, 07:44AM
Bjørn Registered Member #27 Joined: Fri Feb 03 2006, 02:20AM
Location: Hyperborea
Posts: 2058
Here is something to think about...

If you sum the pixels in one block and then move the block one pixel to the right, all the sums in the overlapping area are still valid. So to get the new sum, you add in the rightmost new column and subtract the leftmost old column; for an 8x8 block you only do 8+8 operations instead of 8*8. In some cases you can cache the sum of every column and end up with 8+1 operations. What is fastest depends on how well things fit in the cache, and other obscure effects.
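A sketch of this incremental-sum trick in plain Python (the column-cache variant, for one horizontal band of blocks):

```python
def sliding_block_sums(img, size):
    """Sum of every size x size block across the top band of rows.

    Implements the trick described above: cache the sum of each column
    of height `size`, then slide the block right by adding the entering
    column and subtracting the leaving one, so each step costs O(size)
    instead of O(size * size).
    """
    w = len(img[0])
    # Cached per-column sums over rows [0, size).
    col = [sum(img[y][x] for y in range(size)) for x in range(w)]
    sums = [sum(col[:size])]            # full sum only for the first block
    for x in range(1, w - size + 1):
        sums.append(sums[-1] - col[x - 1] + col[x + size - 1])
    return sums
```

The same cache can be updated when moving down a row (add the entering pixel to each column sum, subtract the leaving one), which is where the 8+1 figure comes from.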

If you turn the picture 90 degrees you can use MMX/SSE instructions and do up to 16 pixels at a time.
Arkin
Sat Mar 20 2010, 04:35PM
Arkin Registered Member #2140 Joined: Tue May 26 2009, 09:16PM
Location:
Posts: 53
I was thinking about that yesterday, actually, but I've got to figure out which pixel sums to cache with varying block steps and block resolutions. I'll need to look up what MMX/SSE is. Thanks!