Introduction: This assignment is an exercise intended to introduce you to the Frankencamera API and the Nokia N900, and should get your feet wet and your brain rolling for the term project. You and your partner will implement an autofocus algorithm for the N900, and in doing so, you will gain an understanding of the challenges involved in interfacing with a camera and making efficient use of the available computational resources.
In fact, if you can come up with an excellent algorithm, it may just become the default implementation provided in the Frankencamera API, or in the Nokia N900, in the future!
Steps:
Getting Started (PLEASE READ IN FULL): How to get set up.
Hello World!: Gain some comfort by ensuring that your build environment and device are good to go.
Hello Camera!: Compile and run the (provided) skeleton code to ensure that the camera module works. Study the code and make sure to understand what it does.
Writing an autofocus algorithm: This is the actual meat of the assignment. Have fun!
Writing a touch-to-focus algorithm: Have more fun!
Making your own scene mode: did someone say macro?
Focusing on two subjects: Need we say more?
Deliverables: If you have a pair team, all of TASKS (A)-(I) and write-up questions Q1-7 are required; if you have a singleton team, the last task and the last write-up question are optional.
Useful link:
Developing for the N900 requires a cross-compilation environment, such as Scratchbox. We have created a vanilla virtual machine (loaded using VMware Player or Fusion) that has everything installed. You can obtain the VM by the following means:
[Not recommended] If you prefer to create your own VM, or install directly on a physical machine, follow these instructions.
If you are enrolled in the course, you should have received an N900 which has already been configured with the necessary modules and updates. You can gain root privileges by entering root (password: root) in a terminal.
Managing your N900 [PLEASE READ]: The distribution of the N900s is made possible by the generosity of Nokia. Please take good care of these devices!
Tips for using your N900:
Your N900 should already have ssh installed. To find out the IP address of your N900, you may either visit sites that tell your IP, such as this, or open an X-terminal and do the following:
~ $ root
Nokia-N900-XX-XX:~# ifconfig
Now scroll up and find the field "inet addr". Once you know your IP address, you can log onto your N900 remotely from your own machine.
cs448a@ubuntu:~$ ssh root@XXX.XXX.XXX.XXX
Nokia-N900-XX-XX:~#
You may now login to Scratchbox as follows:
cs448a@ubuntu:~$ scratchbox
[sbox-FREMANTLE_ARMEL: ~] >
Scratchbox is a chroot environment, meaning that its root directory is simply a directory within the VM's own filesystem. The path to this directory in the VM is given by the symbolic link called scratchbox-root in your home directory of the VM. Likewise, the home directory in Scratchbox is pointed to by a symbolic link called scratchbox-home. Therefore,
[sbox-FREMANTLE_ARMEL: ~] > ls
and
cs448a@ubuntu:~$ cd scratchbox-home
cs448a@ubuntu:~/scratchbox-home$ ls
will return the same results. This is useful because Scratchbox does not come with a good text editor, so you will most likely be editing files under ~/scratchbox-home in your VM.
Running binaries built in Scratchbox is now easy. scp the file onto your N900 from Scratchbox, and run the binary from an ssh terminal. For instance (hypothetically), in Scratchbox,
[sbox-FREMANTLE_ARMEL: ~] > g++ myapp.cpp -o myapp
[sbox-FREMANTLE_ARMEL: ~] > scp myapp root@XXX.XXX.XXX.XXX:/home
[Note] Try to put your files in /home, as there is only limited space in the root file system. You may now ssh into your N900, change directory to /home, and run your binary:
cs448a@ubuntu:~$ ssh root@XXX.XXX.XXX.XXX
Nokia-N900-XX-XX:~# cd /home
Nokia-N900-XX-XX:/home# ./myapp
You could, of course, open an X terminal on your N900 directly, but you might prefer typing on a full keyboard and reading the console output on a full monitor.
If you have completed setting up your N900, this step is relatively easy. Write a very simple command-line program, like Hello World. Compile your code in Scratchbox, transfer the binary onto your N900, and confirm that it runs.
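For reference, a minimal program of this kind might look as follows (the file and binary names are arbitrary):

// hello.cpp -- sanity check for the cross-compilation toolchain
#include <cstdio>

int main() {
    printf("Hello from the N900!\n");
    return 0;
}

Build it inside Scratchbox and copy it over, mirroring the workflow above:

[sbox-FREMANTLE_ARMEL: ~] > g++ hello.cpp -o hello
[sbox-FREMANTLE_ARMEL: ~] > scp hello root@XXX.XXX.XXX.XXX:/home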
There are a couple of Qt examples in the sample directory of Scratchbox. If you built your own Scratchbox, you can find these files here. (They are from Nokia's official Qt examples.) To build these, run qmake -project in the appropriate folder, followed by qmake XXXXXX.pro and make, where XXXXXX is the name of the .pro file that resides in the respective folder. This will generate an executable of the same name. For instance, the following would work:
cs448a@ubuntu:~$ scratchbox
[sbox-FREMANTLE_ARMEL: ~] > cd sample/analogclock
[sbox-FREMANTLE_ARMEL: ~/sample/analogclock] > qmake -project
[sbox-FREMANTLE_ARMEL: ~/sample/analogclock] > qmake -unix analogclock.pro
[sbox-FREMANTLE_ARMEL: ~/sample/analogclock] > make
[sbox-FREMANTLE_ARMEL: ~/sample/analogclock] > scp analogclock root@XXX.XXX.XXX.XXX:/home
Take a minute to familiarize yourself with basic Qt. You will presumably need some minimal experience with Qt to build a reasonable interface to whatever application you make, but no more than what is involved in the examples, unless you intend to use other features, such as multi-threading, event handling via signals and slots, networking, et cetera. A set of examples and their source code can be found here. There is a more serious tutorial from Nokia, released two weeks ago. They would appreciate any feedback, in fact!
In ~/sample/camera of the VM, you will find a skeletal implementation of a camera application. Compile it via make and run the binary on your N900. (Note for those of you with a custom VM: if you placed libFCam.a somewhere particular, you should add the appropriate linker flags.)
At first glance, the skeletal implementation has a working viewfinder, a slider widget for setting exposure, and three buttons. Play around with the slider and buttons and observe what they do. The slider should change the exposure of the incoming frames, and the two buttons on top should change the focus setting. Both the LED flash and the shutter button are hooked up as well. The (physical) shutter button takes a snapshot and saves it on disk. However, no autofocusing currently takes place when you take a snapshot. The code is not long, and is heavily commented. Here's a quick description:
You will be making changes mostly to the two files marked in red, and possibly adding your own files. Your task is described in detail in the following section. NOTE: You should realize that the N900 does not have a controllable aperture. Also, its depth of field is not as controllable as that of your DSLR. From the example images, observe that when focused afar, the depth of field is very large. This application uses the Frankencamera API to interface with the camera hardware. The header files can be found at /usr/include/FCam. You will, throughout the assignment, need to refer to the full API documentation.
Begin by skimming through main.cpp and MyCamera.h/cpp to understand the control flow of the skeleton.
Examine MyCameraThread.h/cpp, especially MyCameraThread::run(). You will find that the thread, once launched, begins a loop that processes incoming frames. In the Frankencamera API, one calls methods of FCam::Sensor to ask the sensor to capture frames with certain parameters. The sensor makes these frames available as they materialize; they are fetched by calling FCam::Sensor::getFrame(), and each frame will have been tagged with relevant metadata. (A rough sketch of this loop appears after this list.)
At the beginning of this loop, you will see a nested loop for event handling. Here you are informed of any events that take place. You will see that there are stubs for handling shutter presses, marked by TODOs.
Past the event-handling loop, you will see a branch for requesting a snapshot.
After the branch, there is a loop for consuming frames. It may be that the sensor has generated more than one frame in the meantime. The purpose of this loop is to consume all the frames to keep the buffer flowing. We will only display the latest frame on screen (see next item in the list), but the other frames so far may contain useful information.
Near the end of the loop, the thread calls MyInfoWidget::processFrame(...) and MyViewfinderX::update(). This tells the respective widgets to refresh. You can treat MyViewfinderX as a black box if you would like. However, go take a look at MyInfoWidget::processFrame(...) and MyInfoWidget::paintGL(...). There you can have the widget display some useful information. (Currently it draws some lines on screen.)
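Putting the pieces above together, a stripped-down version of this frame loop might look like the sketch below. The member names (m_sensor, m_lens, m_keepRunning) and the stream() request are assumptions for illustration; check MyCameraThread::run() for the skeleton's exact calls.

// Illustrative sketch of the per-frame loop; not the skeleton's actual code.
m_sensor.stream(m_shot);              // ask the sensor to stream frames (assumed request call)

while (m_keepRunning) {               // hypothetical loop condition
    // ... handle pending events and snapshot requests here (see the stubs in the skeleton) ...

    FCam::Frame::Ptr f = m_sensor.getFrame();    // blocks until the next frame is ready

    // Each frame carries metadata tagged by the attached devices, e.g. the lens:
    FCam::Lens::Tags * t = f->tags(&m_lens);
    float focusDuringFrame = t->focus;           // focus setting used for this frame

    // ... compute a sharpness score, feed the frame to the widgets, etc. ...
}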
Write-up:
When you tap on the screen, the application prints the location of the tap to the console. (Run the application over ssh for this output to be immediately visible.) Examine MyViewfinderX::mouseReleaseEvent(...). You will see that it emits a signal, which is connected (see MyCamera::MyCamera(...)) to MyCameraThread::FocusRequest(...).
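The connection itself is ordinary Qt signal/slot plumbing; conceptually it amounts to something like the line below (the member names and signatures here are illustrative; see MyCamera::MyCamera(...) for the real ones):

// Hypothetical names/signatures -- the actual connect() call lives in MyCamera::MyCamera(...).
connect(m_viewfinder, SIGNAL(tapped(int, int)),
        m_cameraThread, SLOT(FocusRequest(int, int)));

Inside MyCameraThread::FocusRequest(...), a natural approach is to map the tap coordinates onto one or more cells of the 16x12 sharpness grid (described in the hints below) and restrict your focus metric to those cells.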
Many consumer cameras sport a number of scene modes (portrait, macro, sports, night, landscape, et cetera) and specialize their internal algorithms to the user's choice of mode. Design a special autofocus algorithm for one such scenario. You may come up with your own mode if you'd like (group photo, horizon, sunset, ...). The autofocus algorithm does not have to differ considerably from your default implementation.
Write-up:
Suppose now that we wish to specify the depth of field precisely, i.e. there are two objects in the scene at two different depths, and we would like to focus on everything in between.
TASK (H): There are two given objects at distance d_1 and d_2 from the lens. The focal length is f; the maximal allowable diameter of the circle of confusion is C. The photographer may control the f-number N (aperture) and the object distance U (focus). What values should he use for N and U, if he wishes the depth of field to span exactly the two objects?
TIP for TASK (I): Given two taps by the user, finding d_1 and d_2 is not an easy problem. It is called in the literature "depth from focus" or "depth from defocus." An alternative approach is to choose a focus setting that makes the two objects equally sharp. This would fail if the two objects have different amounts of texture, however, so you may want to focus on them individually to determine how sharp each of them is when in focus, and use this information as a normalizer. If you have other ideas, feel free to discuss them with the staff and/or implement them. We honestly do not know how well this will work on the N900, so it is not imperative that it work well, but be sure to approach the problem with rigor.
NOTE: Task (I) and Q7 are optional for singleton teams.
Write-up:
Hints:
You may implement the features in separate applications, if you find it more convenient.
You may consider creating a MyAutoFocus class which supports function calls like initialize(), terminate(), getBestFocus(), processFrame(FCam::Frame::Ptr f), et cetera, so that you may aggregate all your edits into your own file(s).
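For example, a class skeleton along these lines could work (all names beyond those suggested above, and the overall structure, are illustrative rather than part of the provided code):

// MyAutoFocus.h -- illustrative skeleton only
class MyAutoFocus {
public:
    void initialize(FCam::Lens * lens);       // remember the lens, reset internal state
    void startSweep();                        // begin scanning the focus range
    void processFrame(FCam::Frame::Ptr f);    // score one incoming frame
    bool done() const;                        // has the algorithm converged?
    float getBestFocus() const;               // best focus setting found so far
    void terminate();                         // clean up

private:
    FCam::Lens * m_lens;
    float m_bestFocus;
    int m_bestScore;
    bool m_sweeping;
};

MyCameraThread::run() can then simply forward every incoming frame to processFrame(...) and, once done() reports true, set the lens to getBestFocus().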
As you can figure out from the two button widgets, you can set the focus of the lens by calling FCam::Lens::setFocus(...). For instance,
m_lens.setFocus(m_lens.farFocus(), -1);   // focus as far as possible
m_lens.setFocus(m_lens.nearFocus(), -1);  // focus as close as possible
See FCam/Lens.h to learn what the two arguments do. This header also contains other useful function calls to the lens object.
Remember that moving the lens is not instantaneous. (In fact, you might even want the lens to slow down and not move at its maximal speed.) If you call FCam::Lens::setFocus(...), it might be a while until you get back a frame with the desired setting. The sensor will continue streaming while you are moving the lens!
Devices that are attached to the sensor will automatically tag every frame with metadata. This tells you what settings were used to take a particular frame. For instance,
FCam::Frame::Ptr f;
f = sensor.getFrame();
Lens::Tags * t1 = f->tags(&m_lens);
fprintf(stdout, "The average focus setting during the frame: %f\n", t1->focus);
(See the documentation for a list of available parameters.) If you find that the frame you have just read turns out to be sharp, you might want to set the focus setting of your snapshot to be the same as this one.
The N900 has hardware support for measuring sharpness in parts of the image. This is already activated in the code as follows:
m_shot.sharpness.enabled = true;
m_shot.sharpness.size = FCam::Size(16, 12);
This effectively creates a 16x12 grid of autofocus points. You can query the sharpness "score" of each region by doing the following:
FCam::Frame::Ptr f;
f = sensor.getFrame();
fprintf(stdout, "%d x %d\n", f->sharpness.size.width, f->sharpness.size.height);  // should print out 16 x 12
fprintf(stdout, "Sharpness score at the top-left corner: %d\n", f->sharpness(0,0) >> 10);
This is the recommended method of measuring sharpness, as dealing with raw pixels in software is not as fast.
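For instance, a focus metric over a rectangular block of this grid can be as simple as the helper below (the function name, its parameters, and the assumption that the grid is indexed as (column, row) are illustrative):

// Sum the hardware sharpness scores over a sub-rectangle of the 16x12 grid.
// Assumes f->sharpness is indexed as (column, row), as in f->sharpness(0,0) above.
int regionSharpness(FCam::Frame::Ptr f, int x0, int y0, int x1, int y1) {
    int score = 0;
    for (int y = y0; y <= y1; y++) {
        for (int x = x0; x <= x1; x++) {
            score += f->sharpness(x, y) >> 10;
        }
    }
    return score;
}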
A naive algorithm might go from the nearest focus to the farthest focus in some number of steps, asking the sensor to continuously stream, and calculate some score on each frame that comes back, remembering the focus setting of the best frame. Of course, this is not particularly robust...
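As a concrete (and equally naive) sketch, assuming a regionSharpness()-style metric like the one above, and ignoring the fact that the lens needs time to reach each target (see the earlier hint about setFocus):

// Naive focus sweep: step from near to far, score each frame, remember the best.
// Illustrative only; member names follow the hints above.
const int STEPS = 20;                               // arbitrary number of steps
float bestFocus = m_lens.nearFocus();
int bestScore = -1;

for (int s = 0; s < STEPS; s++) {
    float target = m_lens.nearFocus() +
        (m_lens.farFocus() - m_lens.nearFocus()) * s / (STEPS - 1);
    m_lens.setFocus(target, -1);

    FCam::Frame::Ptr f = m_sensor.getFrame();       // keep consuming frames while the lens moves
    float actual = f->tags(&m_lens)->focus;         // the focus the frame was really taken at
    int score = regionSharpness(f, 0, 0, 15, 11);   // score the whole 16x12 grid

    if (score > bestScore) { bestScore = score; bestFocus = actual; }
}
m_lens.setFocus(bestFocus, -1);                     // move to the sharpest setting found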
You can access the raw data of a frame via f->image.data. By default, the image is in 8-bit UYVY format: in each row, every odd-numbered pixel has its U and Y channels available, and every even-numbered pixel has its V and Y channels available. Finally, the raw data is in row-major form (a one-dimensional representation in which all rows have been concatenated together). Below is a code snippet for a simple grayscale conversion:
// Extract the Y (luma) channel of each pixel to obtain a grayscale image.
unsigned char * myimage = new unsigned char[width*height];
for (int i = 0; i < height; i++) {
    for (int j = 0; j < width; j++) {
        myimage[i*width + j] = f->image.data[(i*width + j)*2 + 1];  // Y is every second byte
    }
}
Currently, pressing the shutter takes a 5MP shot and saves it as a JPEG file in /home/user/MyDoc by calling FCam::AsyncFileWriter::saveJPEG(...). However, the color balance is currently not very good. If you want, you can replace this with FCam::AsyncFileWriter::saveDNG(...) to save the RAW data as a DNG file instead, and convert it using dcraw or Adobe Camera Raw into 8-bit formats.
You can grab the screen by pressing Ctrl-Shift-P on your N900. Unfortunately, as MyViewfinderX draws directly on screen, its content will not be captured. The screenshots are saved in /home/user/MyDoc/.images/Screenshot.
[Advanced] The N900 features an ARM Cortex-A8 processor, which has a NEON SIMD unit. If you are doing serious pixel math (like convolution), consider using NEON intrinsics for speedup. A 4x improvement over plain CPU code is typical, as long as you are not overloading the unit. This advice applies especially to the term project.
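To give a flavor of what this looks like, here is a minimal sketch that averages two 8-bit grayscale buffers 16 pixels at a time (it assumes the length is a multiple of 16; compile with -mfpu=neon):

#include <arm_neon.h>

// Average two 8-bit grayscale images, 16 pixels per iteration.
// Assumes n is a multiple of 16; illustrative only.
void averageImages(const unsigned char * a, const unsigned char * b,
                   unsigned char * out, int n) {
    for (int i = 0; i < n; i += 16) {
        uint8x16_t va = vld1q_u8(a + i);          // load 16 pixels from each image
        uint8x16_t vb = vld1q_u8(b + i);
        vst1q_u8(out + i, vrhaddq_u8(va, vb));    // rounding halving add = per-pixel average
    }
}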
Consult the class newsgroup su.class.cs448a if all else fails.
My N900 complains of memory shortage!: The N900 has only limited storage in its root filesystem. You might want to put things in /home, especially large images and files. Run df -h to see how much of the filesystem you are using.
g++ cannot find header files I wrote!: This is most likely a problem associated with qmake. See here for a tutorial.
I can't ssh into my N900!: ping it to see if it is visible from your machine. If you are behind a NAT, you may have to set up port forwarding.
OpenGL API causes compiler errors!: Use OpenGL ES 2.0 API only. You do not have the full OpenGL API at your disposal.
How do I use my N900 as a storage device, for fast file transfer?: Connect it to a computer via the USB cable and select "Mass Storage" mode. By default, /home/user/MyDoc is mounted as the root of the drive. It is also possible for the VM to provide an Ethernet connection over USB; instructions may be added in the future.
My application is frozen!: You can press the power button to bring up an interface that includes a "Quit this application" button. Alternatively, you can ssh into your N900 and kill the process.
I can't run the VM!: You will need the latest VMware Player (3.0.0).
When I run the camera application on desktop, or open the lens cover, I get an "Operation Failed" message!: This is entirely normal. We have replaced the camera modules, so the built-in application will fail to operate. (Remember, we are writing our own camera application!)