Tuesday, November 23, 2010

Demystifying Google Chrome Threading Model

This post is all about technical stuff that might interest developers who are looking to enhance their thread programming skills.


Maybe the best way to learn is to study what others have already built and proven in practice. What better example than how Google did it in its Chrome browser?


The Google Chrome threading model consists of three components:
- The Thread class encapsulates operating system threads. It abstracts the Chrome browser from the operating system details, so whether it runs on Windows, Linux or another system, Chrome code deals only with the Thread class.
- The MessageLoop class contains queues for receiving and handling tasks. This is the way to communicate with the thread. Each Thread object contains one message loop. MessageLoop is subclassed to handle tasks in a specialized manner.
- The MessagePump class, which could be better named the orchestrator, controls the job of the MessageLoop. MessagePump comes in different flavors, depending on the mission it is designed for.

So a Thread object owns a MessageLoop object, which in turn owns a MessagePump object. When this system runs, it goes as follows:
After the thread is initialized and started, it calls its Run() method, which calls the run method of the MessageLoop object, which in turn calls the run method of the MessagePump.
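The ownership and call chain described above can be sketched roughly as follows. This is only an illustration: SketchThread, SketchLoop, SketchPump and DemoRunChain are hypothetical stand-ins rather than Chrome's real classes, and no operating system thread is started here. The trace vector exists only to make the order of the calls visible.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for MessagePump: the innermost layer of the chain.
class SketchPump {
 public:
  void Run(std::vector<std::string>& trace) { trace.push_back("MessagePump::Run"); }
};

// Hypothetical stand-in for MessageLoop: owns exactly one pump.
class SketchLoop {
 public:
  SketchLoop() : pump_(std::make_unique<SketchPump>()) {}
  void Run(std::vector<std::string>& trace) {
    trace.push_back("MessageLoop::Run");
    pump_->Run(trace);  // the loop delegates to its pump
  }

 private:
  std::unique_ptr<SketchPump> pump_;
};

// Hypothetical stand-in for Thread: owns exactly one message loop.
class SketchThread {
 public:
  SketchThread() : loop_(std::make_unique<SketchLoop>()) {}
  void Run(std::vector<std::string>& trace) {
    trace.push_back("Thread::Run");
    loop_->Run(trace);  // the thread delegates to its loop
  }

 private:
  std::unique_ptr<SketchLoop> loop_;
};

// Usage sketch: record the order in which the three Run() methods are entered.
std::vector<std::string> DemoRunChain() {
  std::vector<std::string> trace;
  SketchThread thread;
  thread.Run(trace);
  assert(trace.size() == 3);  // one entry per layer
  return trace;
}
```

Calling DemoRunChain() yields the three entries in the order Thread, MessageLoop, MessagePump.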






MessagePump

The default behaviour of a MessagePump is an endless loop that passes through the following steps:
- tells the message loop to execute the pending tasks
- checks whether a quit order has been issued; if so it quits, otherwise it continues
- tells the message loop to execute delayed tasks
- checks whether a quit order has been issued; if so it quits, otherwise it continues
- tells the message loop to execute any idle job
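The steps above can be sketched as a small loop. The names used here (PumpDelegate, SimplePump, RecordingDelegate, DemoPump) are assumptions made for illustration and are not taken from the Chrome source. The sketch also spins continuously, whereas a production pump would presumably sleep when there is nothing to do.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical interface the pump drives; in this sketch the message loop
// would be the one implementing it.
class PumpDelegate {
 public:
  virtual ~PumpDelegate() = default;
  virtual void DoWork() = 0;         // execute pending tasks
  virtual void DoDelayedWork() = 0;  // execute delayed tasks that are due
  virtual void DoIdleWork() = 0;     // execute any idle job
};

class SimplePump {
 public:
  void Run(PumpDelegate& delegate) {
    keep_running_ = true;
    for (;;) {
      delegate.DoWork();
      if (!keep_running_) break;  // quit order issued during pending work
      delegate.DoDelayedWork();
      if (!keep_running_) break;  // quit order issued during delayed work
      delegate.DoIdleWork();
    }
  }

  // Asks the loop in Run() to stop at its next check.
  void Quit() { keep_running_ = false; }

 private:
  bool keep_running_ = true;
};

// Test delegate that logs each step and quits during the second delayed step.
class RecordingDelegate : public PumpDelegate {
 public:
  explicit RecordingDelegate(SimplePump& pump) : pump_(pump) {}
  void DoWork() override { log.push_back("work"); }
  void DoDelayedWork() override {
    log.push_back("delayed");
    if (++passes_ == 2) pump_.Quit();
  }
  void DoIdleWork() override { log.push_back("idle"); }

  std::vector<std::string> log;

 private:
  SimplePump& pump_;
  int passes_ = 0;
};

// Usage sketch: run the pump until the delegate asks it to quit.
std::vector<std::string> DemoPump() {
  SimplePump pump;
  RecordingDelegate delegate(pump);
  pump.Run(delegate);
  assert(delegate.log.front() == "work");  // every pass starts with pending work
  return delegate.log;
}
```

In this run the pump completes one full pass, then stops in the middle of the second pass, right after the delayed step that issued the quit order.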


MessageLoop

The MessageLoop has four queues to manage tasks: the incoming-queue, which stores any new task submitted to the thread; the work-queue, which contains the tasks to be executed; and the delayed-work-queue, which contains tasks that should be executed at some point in the future.
Finally, the fourth queue is the deferred-non-nestable-work-queue, which has to do with reentrancy and will be discussed later.
All queues work in FIFO mode.

When a task is submitted to the Thread, it is inserted in the incoming-queue. When the MessagePump tells the message loop to execute pending tasks, the MessageLoop checks whether the work-queue contains tasks. If it does not, all the tasks in the incoming-queue are transferred to the work-queue. If the work-queue already contains tasks, the execution continues normally.
The MessageLoop pulls a task from the work-queue and verifies whether it should be executed at once. If the task is delayed, meaning it should be executed in the future, it is inserted into the delayed-work-queue; otherwise it is executed right away.
Observers are components that might be interested in knowing what tasks are being executed in the MessageLoop. For this reason they register their intention with the message loop. The message loop notifies all observers before and after the execution of each task.
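A minimal sketch of this notification scheme is shown below. The names TaskObserver, ObservedLoop, WillRunTask, DidRunTask, LoggingObserver and DemoObserver are chosen for this example and are not claimed to match Chrome's own API.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical observer interface: notified around every task.
class TaskObserver {
 public:
  virtual ~TaskObserver() = default;
  virtual void WillRunTask() = 0;
  virtual void DidRunTask() = 0;
};

class ObservedLoop {
 public:
  // Registers an observer; the loop does not take ownership.
  void AddObserver(TaskObserver* observer) {
    assert(observer != nullptr);
    observers_.push_back(observer);
  }

  // Runs one task, notifying all observers before and after it.
  void RunTask(const std::function<void()>& task) {
    for (TaskObserver* o : observers_) o->WillRunTask();
    task();
    for (TaskObserver* o : observers_) o->DidRunTask();
  }

 private:
  std::vector<TaskObserver*> observers_;  // not owned
};

// Example observer that appends a marker to a shared log.
class LoggingObserver : public TaskObserver {
 public:
  explicit LoggingObserver(std::vector<std::string>& log) : log_(log) {}
  void WillRunTask() override { log_.push_back("before"); }
  void DidRunTask() override { log_.push_back("after"); }

 private:
  std::vector<std::string>& log_;
};

// Usage sketch: one observer, one task.
std::vector<std::string> DemoObserver() {
  std::vector<std::string> log;
  LoggingObserver observer(log);
  ObservedLoop loop;
  loop.AddObserver(&observer);
  loop.RunTask([&log] { log.push_back("task"); });
  return log;
}
```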

A good question to ask here is why have both the incoming-queue and the work-queue. Why not just the incoming-queue alone? It is all about performance. Each time the incoming-queue is accessed, for input or output, a lock is acquired to avoid data corruption. To reduce the number of lock acquisitions made by the MessageLoop, the lock is taken only once in a while (when the work-queue becomes empty), and the entire content of the incoming-queue is transferred to the work-queue. This way the incoming-queue remains available to receive more tasks while the message loop is busy executing those in the work-queue.
It is worth noting that this strategy pays off when the number of expected incoming tasks is high. On the other hand, if the number of incoming tasks is low there will be no performance gain, since the MessageLoop will spend its time locking and transferring tasks from the incoming-queue to the work-queue.
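The two-queue idea can be sketched with standard C++ primitives. This is a simplified illustration, not Chrome's implementation: TwoQueueLoop, Closure, PostTask, RunPendingTasks and DemoFifoOrder are hypothetical names, delayed tasks are left out, and the whole batch is handed over with a single swap under the lock.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <queue>
#include <utility>
#include <vector>

using Closure = std::function<void()>;

class TwoQueueLoop {
 public:
  // May be called from any thread; takes the lock once per posted task.
  void PostTask(Closure task) {
    std::lock_guard<std::mutex> guard(incoming_lock_);
    incoming_queue_.push(std::move(task));
  }

  // Called only on the loop's own thread. Runs tasks until both queues are
  // empty and returns how many tasks were executed.
  int RunPendingTasks() {
    int ran = 0;
    for (;;) {
      if (work_queue_.empty()) {
        ReloadWorkQueue();
        if (work_queue_.empty()) break;  // nothing was waiting
      }
      Closure task = std::move(work_queue_.front());
      work_queue_.pop();
      task();  // executed without holding the lock
      ++ran;
    }
    return ran;
  }

 private:
  // Takes the lock once and moves the whole incoming batch to the work queue.
  void ReloadWorkQueue() {
    assert(work_queue_.empty());
    std::lock_guard<std::mutex> guard(incoming_lock_);
    std::swap(work_queue_, incoming_queue_);
  }

  std::mutex incoming_lock_;             // protects incoming_queue_ only
  std::queue<Closure> incoming_queue_;   // shared with posting threads
  std::queue<Closure> work_queue_;       // touched only by the loop's thread
};

// Usage sketch: three posted tasks run in FIFO order.
std::vector<int> DemoFifoOrder() {
  TwoQueueLoop loop;
  std::vector<int> order;
  for (int i = 0; i < 3; ++i) {
    loop.PostTask([&order, i] { order.push_back(i); });
  }
  loop.RunPendingTasks();
  return order;
}
```

Because the work queue is only ever touched by the loop's own thread, running a task needs no lock at all, and a task is free to call PostTask() without deadlocking.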

Reentrancy

Generally speaking, a function is reentrant when it can be called again while an earlier call has not yet finished. A recursive function is one example of reentrancy.

MessageLoop is reentrant, meaning that while it is executing a task, its Run() method can be called again to execute subsequent tasks even though the first task is not over yet. In this case we call the first call the outer MessageLoop and the second call to Run() the inner MessageLoop (Important: we are talking about the same MessageLoop instance. Outer and inner refer to the first and second calls to its Run() method).

Consider an example where a task opens a modal dialog box. This dialog box must be able to respond to user’s input so the message loop has to keep executing tasks! However since it did not return from the task that created the modal dialog box, it can’t proceed with the other tasks (this is the outer MessageLoop). To make this possible the modal dialog box calls the Run() method of the MessageLoop in the following way (this is the inner MessageLoop):


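// Remember whether nested tasks were allowed so the setting can be restored.
bool old_state = MessageLoop::current()->NestableTasksAllowed();
// Allow tasks to run while the nested (inner) call to Run() is active.
MessageLoop::current()->SetNestableTasksAllowed(true);
// Inner call to Run() on the same MessageLoop instance.
MessageLoop::current()->Run();
// Restore task state.
MessageLoop::current()->SetNestableTasksAllowed(old_state);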

This ensures that the modal dialog box is responsive and the tasks are being handled by the same MessageLoop.

Tasks that may run in an inner MessageLoop are called nestable tasks and have their nestable property set to true.
However, if a task is not nestable, it must be executed after the first task completes. To make this possible, the inner MessageLoop does not execute the task but adds it to the deferred-non-nestable-work-queue. When all inner message loops have finished their job and returned, the outer MessageLoop executes the tasks that were pushed to the deferred-non-nestable-work-queue.
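The deferral rule can be illustrated with a small single-threaded sketch. Everything here (NestingLoop, LoopTask, DemoNesting) is hypothetical and heavily simplified: there is no locking, and an inner run simply ends when the work queue is empty instead of waiting for a quit order.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical task record: the callable plus its nestable property.
struct LoopTask {
  std::function<void()> run;
  bool nestable = true;
};

class NestingLoop {
 public:
  void PostTask(std::function<void()> run, bool nestable) {
    work_queue_.push_back(LoopTask{std::move(run), nestable});
  }

  // Runs until there is nothing left to do. Calling Run() from inside a task
  // starts an inner run on the same instance.
  void Run() {
    ++run_depth_;
    for (;;) {
      LoopTask task;
      if (!work_queue_.empty()) {
        task = std::move(work_queue_.front());
        work_queue_.pop_front();
        if (run_depth_ > 1 && !task.nestable) {
          // Inner run: set the non-nestable task aside for the outer run.
          deferred_non_nestable_queue_.push_back(std::move(task));
          continue;
        }
      } else if (run_depth_ == 1 && !deferred_non_nestable_queue_.empty()) {
        // Outer run: deferred tasks may be executed now.
        task = std::move(deferred_non_nestable_queue_.front());
        deferred_non_nestable_queue_.pop_front();
      } else {
        break;
      }
      task.run();
    }
    assert(run_depth_ > 0);
    --run_depth_;
  }

 private:
  int run_depth_ = 0;  // 1 = outer run, 2 or more = inner run
  std::deque<LoopTask> work_queue_;
  std::deque<LoopTask> deferred_non_nestable_queue_;
};

// Usage sketch: a "dialog" task posts two tasks and starts an inner run.
std::vector<std::string> DemoNesting() {
  NestingLoop loop;
  std::vector<std::string> log;
  loop.PostTask([&loop, &log] {
    log.push_back("dialog opened");
    loop.PostTask([&log] { log.push_back("nestable"); }, true);
    loop.PostTask([&log] { log.push_back("non-nestable"); }, false);
    loop.Run();  // inner run
    log.push_back("dialog closed");
  }, true);
  loop.Run();  // outer run
  return log;
}
```

In this run the nestable task executes while the "dialog" is still open, whereas the non-nestable task only executes after the dialog task has returned to the outer run.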

This was a brief simplified overview of how threading in Google Chrome browser works. It constitutes a good start for those who scratch their heads to find a solid and proven concept to implement threading.

Saturday, November 20, 2010

How to become a world class developer

Every developer dreams of reaching world class rank. But this rank seems out of reach for anyone who is not working in a big company, on a big project. Working on small projects narrows one's horizons and hides much of the complexity of big projects and the approaches they take to solve it.


While books are necessary to learn, they stop short of giving real experience. Books are good for getting started and for learning the methodology, the object model and the APIs. This package is necessary but not sufficient. No book is able to explain the details of a project and the pros and cons of every technical decision made in the course of the development.


The picture looks rather dim for those who are not lucky enough to work in giant companies and produce software that is used by millions of people.


This is not completely true. There is a way of becoming a top notch developer even while working on small projects. The solution exists but it is not free! It costs time and effort. It is only for people who are passionate about developing.


The large majority of commercial software has its code hidden and restricted to the teams that work on it. But there are also very important open source projects that are accessible to everyone and are used all around the world. This software is written by highly skilled engineers, either as part of their day job or in their free time. Either way, these projects are “gold mines” for developers who are aiming to acquire new skills.


So the solution is to find the right open source project, one that is interesting to the person wanting to learn. Then, try to build the project and start learning the code. At first it looks like a daunting task. It might fail a few times, but success comes with persistence and perseverance.


Every open source project has forums and discussion groups, but almost all of them lack complete and clean documentation. So a good approach is to grasp the overall architecture of the project and read all the documentation that is available. Then start looking in the code for the following points:
- Coding style: it is important to understand how the lines of code are written because it helps improve readability.
- Tips and tricks: every module or sub module of the project brings a solution to a problem. Some are straightforward; others are trickier and use an indirect approach. It is crucial to learn these tricks in order to use them when facing similar problems.
- Global architecture: while tips and tricks are used on a granular level, the global architecture gives the big picture and reflects the decisions made to solve key problems. It is worth mentioning that there is no single ideal solution, only a solution that fits certain priorities established by the project authors.


After understanding the code, or at least part of it, it is important to get involved with the community that is working directly or indirectly on the project. For this reason it is imperative to visit the project forums on a regular basis and actively contribute to the discussions. This will increase the understanding of the project and of the reasons why some decisions are being taken.


Another major point is to discover bugs and propose solutions for them. Many open source projects have a contribution policy with a mandatory path that goes through discovering and fixing bugs. Once a certain number of bugs and fixes is reached, the person is allowed to become a committer, which means contributing directly to the development of the project. When such a stage is reached, congratulations! You are now a world class developer.