4hv.org :: Forums :: General Science and Electronics

More 3D reconstruction Q's

Arkin
Wed Mar 17 2010, 01:16AM
Arkin Registered Member #2140 Joined: Tue May 26 2009, 09:16PM
Location:
Posts: 53
For those of you who aren't aware, I am working on a project for 3D reconstruction. The main info and early progress are here: Link2

This is the approximate current result of the disparity map: Link2

Recently, some people at blenderartists wrote me a script that imports points from a text file into Blender as a point cloud. Right now no real distances are calculated; the values are just taken directly from the displacement value. Link2

These are my questions:

1) In images, farther objects appear smaller than they are in real life. Should I compensate for this by using the distance value to calculate the real size? What would that calculation look like? Where would the real-sized object be in terms of (x, y)? Should it be scaled about the same centre?
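For what it's worth, under a simple pinhole-camera model (an assumption on my part, not something stated in the thread), apparent size scales as 1/Z, so the compensation asked about in question 1 might be sketched like this. The focal length f and image centre (cx, cy) are hypothetical calibration values:

```python
# Sketch of perspective compensation under an assumed pinhole model.

def pixel_to_world(u, v, Z, f, cx, cy):
    """Back-project pixel (u, v) at depth Z to world coordinates (X, Y, Z).

    f is the focal length in pixels; (cx, cy) is the image centre.
    An object w pixels wide at depth Z has real width about w * Z / f,
    and the scaling happens about the centre (cx, cy), not the corner.
    """
    X = (u - cx) * Z / f
    Y = (v - cy) * Z / f
    return X, Y, Z
```

So a pixel 100 columns right of centre, at depth 2 m with f = 500 px, sits 0.4 m off the optical axis.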

2) Right now, I am just using the left image and building the disparity map on top of it, so objects and map elements align. Should I instead use the midpoint of the two detected instances in each image? For example:
l. image: ---x----------
r. image: -----------x--

so the calculated displacement is the absolute difference, or 7 units. Normally, the disparity map entry would sit at the left image location. Should I instead displace it to the right by 3.5 units (7 divided by 2)?

3) How would I go about calculating "density", so that small patches or stray pixels are filtered out? Should I simply enforce a minimum area with a flood fill?
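One way to read question 3: treat the pixels that have a valid disparity as a binary mask and drop connected regions below a minimum area with a flood fill. A minimal sketch in plain Python, assuming 4-connectivity:

```python
from collections import deque

def filter_small_patches(mask, min_area):
    """Remove connected regions of True cells smaller than min_area.

    mask: 2D list of bools (pixels with a valid disparity).
    Returns a new mask. A minimal flood-fill sketch of the idea,
    not an optimised implementation.
    """
    h, w = len(mask), len(mask[0])
    out = [[False] * w for _ in range(h)]
    seen = [[False] * w for _ in range(h)]
    for sy in range(h):
        for sx in range(w):
            if mask[sy][sx] and not seen[sy][sx]:
                # Flood-fill one 4-connected component.
                comp, q = [], deque([(sy, sx)])
                seen[sy][sx] = True
                while q:
                    y, x = q.popleft()
                    comp.append((y, x))
                    for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            q.append((ny, nx))
                if len(comp) >= min_area:   # keep only large-enough patches
                    for y, x in comp:
                        out[y][x] = True
    return out
```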

I also had another thought on matching the two images (image correspondence). Right now, the Sum of Absolute Differences works to an extent, but it has a major limitation on the size of the objects it can detect. If you go smaller, major errors become apparent.
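For reference, the baseline SAD matcher being described can be sketched as follows; rectified grayscale images (so matches lie on the same scanline) are assumed:

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equal-sized 2D blocks."""
    return sum(abs(a - b) for ra, rb in zip(block_a, block_b)
               for a, b in zip(ra, rb))

def best_disparity(left, right, x, y, size, max_disp):
    """Find the horizontal shift minimising SAD for the block at (x, y).

    left/right are 2D lists of grayscale values, assumed rectified.
    A minimal sketch of the baseline matcher, one block at a time.
    """
    ref = [row[x:x + size] for row in left[y:y + size]]
    best, best_cost = 0, float("inf")
    for d in range(min(max_disp, x) + 1):   # candidate shifts to the left
        cand = [row[x - d:x - d + size] for row in right[y:y + size]]
        cost = sad(ref, cand)
        if cost < best_cost:
            best, best_cost = d, cost
    return best
```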

Instead, what if it grabbed pixel (x, y) and linked it to reference points, say (x+3, y-3) among others? When searching for that "network", it would have to match a certain number of points within a certain threshold.

The reasons I am hesitant to try this are:
1) It's complicated. What happens if the reference points don't exist (they fall off the image)? Should I have "backup references"? Randomly indexed references?

2) Speed. Having so many networks creates a huge amount of data. For example, a 640x480 picture is 307,200 pixels. With only 5 reference pixels each, which I think is low, that becomes 1,536,000 points. Beyond the memory cost, the computer has to search all of these points, search neighboring pixels within the threshold, and so on. Raise the threshold by one point and you add multiple search areas.

3) Color at the pixel level is not very reliable. It may work with enough reference points, but then you run into problem 2. Instead of color, you could match a point's links' links, for however many levels you want to follow the recursive tree. That would also be computationally intensive, at an exponential rate, and it would be hard for me to code that much recursion.

Disregarding speed, do you think using a network of reference points would work?
Regarding speed, do you think this kind of computation is suitable for GPU computing (CUDA or OpenCL)?

Sorry for the big wall of text; I am trying to be descriptive.

EDIT: Do you think having a network of computers do the calculation would help, or would network (LAN) speed just limit it? They all share a public drive, so there is no need to transmit images, only which parts to match up, with the coordinates sent back to a master.
Bjørn
Wed Mar 17 2010, 03:15AM
Bjørn Registered Member #27 Joined: Fri Feb 03 2006, 02:20AM
Location: Hyperborea
Posts: 2058
This is a lot of stuff...

If you want to render your data, you need to take into account the distance from the camera; Wikipedia will tell you all about 3D transformations in a completely unreadable way. You have a camera, so why not take a few pictures at different distances and work out the simple formula yourself?
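The "work out the simple formula" suggestion can be made concrete. Under the usual pinhole stereo model (my assumption, not stated in the thread), depth is inversely proportional to disparity, Z = f*B/d, so a single constant k = f*B can be fitted by least squares from a few measured (disparity, distance) pairs:

```python
def fit_inverse_model(samples):
    """Fit Z = k / d to measured (disparity, distance) pairs.

    Under the pinhole stereo model, Z = f * B / d, with f the focal
    length in pixels and B the camera baseline, so one constant
    k = f * B suffices. Closed-form least squares of Z against 1/d:
    k = sum(Z_i / d_i) / sum(1 / d_i^2).
    """
    num = sum(z / d for d, z in samples)
    den = sum(1.0 / (d * d) for d, z in samples)
    return num / den

def disparity_to_depth(d, k):
    """Convert a disparity value to depth using the calibrated constant."""
    return k / d
```

A handful of tape-measure readings at known distances is enough to pin down k.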

You can speed up the calculations 1000 times by improving your algorithm, and I doubt you can get 1000 fast PCs, so think about distribution after everything is working fine. The same goes for CUDA: I have done some work with it, and the speed-up is nowhere close to the claims when you compare fairly (optimized program vs. optimized program). You will get a faster program that only works on some PCs, and that is slower than an optimized original on some of the ones it does work on.

You need to think more about your requirements, because you are wasting time by not focusing your work on the most important areas. For example, the distance between your two cameras affects your accuracy significantly: move the cameras closer and everything becomes much simpler. It is realistic to get good subpixel resolution on objects that are not very small, so working on that might be a much more efficient way to spend your time. The same goes for variable distance, unless you have some secret requirement for a fixed distance.
Carbon_Rod
Wed Mar 17 2010, 09:04AM
Carbon_Rod Registered Member #65 Joined: Thu Feb 09 2006, 06:43AM
Location:
Posts: 1155
Great tool for raw data:
Link2

A 3D scanner built from a camera and a computer projector may work better.
Google: structured light scanner

Cheers,
Arkin
Wed Mar 17 2010, 11:15PM
Arkin Registered Member #2140 Joined: Tue May 26 2009, 09:16PM
Location:
Posts: 53
I have a hard time believing that CUDA wouldn't make much of a difference. Some applications don't gain much, but this is extremely repetitive and recursive. Instead of matching one block at a time, you could match more than 100 at a time. The main speed limit will be transferring the data to video RAM.

Also, how would having the cameras closer make it more accurate? I purposely set them far apart to get a recognizable change.

I do have access to 60+ computers at my school, each dual core. But yes, I would rather avoid that complexity (networking, job management, etc.).
Bjørn
Thu Mar 18 2010, 01:50AM
Bjørn Registered Member #27 Joined: Fri Feb 03 2006, 02:20AM
Location: Hyperborea
Posts: 2058
CUDA is not like having hundreds of separate processors; it is several arrays of ALUs where all units in an array stall unless they can all follow the same code path. The largest speed-up I have gotten is 4x, comparing CUDA to a pretty optimal program running on all cores of my PC. A comparison made by NVIDIA would have run it unoptimized on one core and claimed a 50x speed-up.

This searching will run particularly well on a GPU, since each search operation can be done completely independently. On the other hand, a CPU can do it more efficiently because it can filter out blocks in ways a GPU can't: the GPU must treat each block the same to avoid stalling a large group of ALUs. If you do it right, almost all the data will come out of the CPU's cache, so the GPU can't exploit its advantage of huge memory bandwidth. The CPU will therefore gain back some of the GPU's advantage, and the speed-up will not be able to compete with the improvements a better algorithm will bring.


Moving the cameras closer reduces the apparent rotation of objects that happens when you view them from a different angle, which makes it easier to find an exact match for your block. It also greatly reduces the number of pixels that are visible in only one picture. This increased signal-to-noise ratio can be used to make your program faster and more accurate.
Arkin
Thu Mar 18 2010, 12:12PM
Arkin Registered Member #2140 Joined: Tue May 26 2009, 09:16PM
Location:
Posts: 53
Right now, each block is being treated independently, just on one core. Every block goes through the same code.

What ways of filtering are you talking about? I was simply going to calculate the disparities, then go through the resulting array and filter it out.

I've been experimenting with closer cameras, and it works well, but I cannot go to larger search block sizes (they overlap then). This results in a noisier final product, with lots of disparities "mixed in".

I am thinking it now comes down to filtering, since the SAD algorithm itself is fairly simple. I also plan to try the network of reference pixels.

Also, I was going to try using edge detection. With edge detection, I can make several bounding areas, which are essentially the bounds of an object. If a bounded area is less than x (so it's not something big like the floor), then all of the disparities inside it will be set the same.

The problem I have been having with that is the pictures aren't perfect, so I am getting gaps in the lines. What's a good way to "guess" where missing line segments go, based on the surroundings?
Bjørn
Thu Mar 18 2010, 02:49PM
Bjørn Registered Member #27 Joined: Fri Feb 03 2006, 02:20AM
Location: Hyperborea
Posts: 2058
I mean that you can save a lot of time by filtering out blocks that are not worth processing. For example, you can check a single pixel and see that it is entirely the wrong colour, or that the contrast is so low that any result would be meaningless. As another example, there are several fast methods for finding the brightness of a block, and you can skip comparisons for blocks whose brightness differs too much. The list of fast ways to filter out hopeless blocks can be quite long. Searching in the frequency domain is also faster.
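A sketch of this quick-rejection idea; the thresholds are illustrative assumptions, not values from the thread:

```python
def block_stats(block):
    """Mean and peak-to-peak contrast of a 2D block of grayscale values."""
    flat = [p for row in block for p in row]
    return sum(flat) / len(flat), max(flat) - min(flat)

def worth_comparing(ref_block, cand_block, min_contrast=8, max_mean_diff=20):
    """Cheap rejection tests to run before a full SAD comparison.

    Skip featureless blocks (any match would be meaningless) and
    candidate blocks whose mean brightness is too far from the
    reference to ever match well. Threshold values are made up.
    """
    ref_mean, ref_contrast = block_stats(ref_block)
    if ref_contrast < min_contrast:
        return False                    # low contrast: not worth matching
    cand_mean, _ = block_stats(cand_block)
    return abs(ref_mean - cand_mean) <= max_mean_diff
```

Only the pairs that survive these tests need the expensive per-pixel comparison.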

SAD is not the most accurate way to compare blocks; see for example: Link2. I suggest you try a video compressor, enable its motion compensation debugging output, and see how well it does on your pictures. Many possibilities exist; for speed, a hybrid that chooses a method depending on the nature of the block might be the most efficient.

Object detection is very difficult except in trivial cases.
Arkin
Fri Mar 19 2010, 04:54PM
Arkin Registered Member #2140 Joined: Tue May 26 2009, 09:16PM
Location:
Posts: 53
I will be trying more than one method, but I want to complete each one (this is, hopefully, for a Siemens fair or the Intel talent search).

I started rewriting the program, and now, before finding matches, it filters out bad blocks based on brightness and contrast. This did make everything faster (well, actually slower, but faster with the GUI components disabled). I also managed to cut the memory usage by more than half.

I didn't mean object detection exactly, but "boundaries". For example, if it is scanning left to right, it continues until it reaches a boundary (an edge). It would then average all the blocks before it, set all of those blocks to the average, and continue. This way, the edges become lines separating different disparities.

Obviously, slanted planes would cause trouble, but a combination may work. I am hoping to try this, even if there is a low chance of success.
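On a single scanline, the edge-bounded averaging described above might look like this (a sketch of one reading of the scheme, not a tested method):

```python
def smooth_between_edges(disparities, edges):
    """Average each run of disparities between edge pixels on one scanline.

    disparities: list of disparity values for one row.
    edges: list of bools marking edge pixels on that row.
    Each maximal run of non-edge pixels is replaced by its mean, so
    edges become the boundaries between flat disparity regions.
    """
    out = list(disparities)
    start = 0
    for i in range(len(disparities) + 1):
        at_break = i == len(disparities) or edges[i]
        if at_break:
            if i > start:               # average the run [start, i)
                avg = sum(disparities[start:i]) / (i - start)
                for j in range(start, i):
                    out[j] = avg
            start = i + 1
    return out
```

Applying this per row (and perhaps per column as well) would flatten each edge-bounded region, at the cost of the slanted-plane problem mentioned.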
Bjørn
Sat Mar 20 2010, 07:44AM
Bjørn Registered Member #27 Joined: Fri Feb 03 2006, 02:20AM
Location: Hyperborea
Posts: 2058
Here is something to think about...

If you sum the pixels in one block and then move the block one pixel to the right, all the sums in the overlapping area are still valid. So to get the new sum, you add in the rightmost new column and subtract the leftmost old column; for an 8x8 block you only do 8+8 operations instead of 8*8. In some cases you can cache the sum of every column and end up with 8+1 operations. What is fastest depends on how well things fit in the cache, and other obscure effects.
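A sketch of this incremental-sum trick in plain Python (the column-cache variant, for one horizontal band of blocks):

```python
def sliding_block_sums(img, size):
    """Sum of every size x size block across the top band of rows.

    Implements the trick described above: cache the sum of each column
    of height `size`, then slide the block right by adding the entering
    column and subtracting the leaving one, so each step costs O(size)
    instead of O(size * size).
    """
    w = len(img[0])
    # Cached per-column sums over rows [0, size).
    col = [sum(img[y][x] for y in range(size)) for x in range(w)]
    sums = [sum(col[:size])]            # full sum only for the first block
    for x in range(1, w - size + 1):
        sums.append(sums[-1] - col[x - 1] + col[x + size - 1])
    return sums
```

The same cache can be updated when moving down a row (add the entering pixel to each column sum, subtract the leaving one), which is where the 8+1 figure comes from.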

If you turn the picture 90 degrees you can use MMX/SSE instructions and do up to 16 pixels at a time.
Arkin
Sat Mar 20 2010, 04:35PM
Arkin Registered Member #2140 Joined: Tue May 26 2009, 09:16PM
Location:
Posts: 53
I was thinking about that yesterday, actually, but I've got to figure out which pixel sums to cache with varying block steps and block resolutions. I'll need to look up what MMX/SSE is. Thanks!