4. Umbrella Processing
[Figure: raw data (top-down video), classified umbrellas, and number of umbrellas as a function of time]
5. 01 Collecting Data: how to collect the data we want practically from raw video; erode components until they shrink to points
02 Tracking Multiple Objects: tracking multiple objects in unstructured scenes is a challenge
03 Analyzing Data: evaluation of the human swarm's performance
9. K-means Applying
Centroids overlaid on the raw video. The umbrellas' position data was first collected and stored; it can then be used in a Kalman filter to track the motion paths.
13. Seven Experiments
1. Response Time to a Vocal Command
2. Human Swarm's Learning Rate
3. Distributed Consensus Analysis
4. Accuracy Comparison
5. Shape-Matching Ability
6. Position Memory
7. Simulation
15. Response Time to Command
[Figure: ratio of red umbrellas and ratio of blue umbrellas over time, annotated with t = 4.8 s and t = 3.6 s]
$\text{Accuracy} = \dfrac{\text{Number of Major Color}}{\text{Number of All Colors}}$
32. Simulation
(1) We initialize 200 points in a 2D space and have each of them randomly choose a color: red, green, or blue.
(2) In a loop, iterate until convergence: for each point, find its K nearest neighbors, compare their colors, and choose the majority color.
[Figure: panels (1) and (2), Y Position versus X Position]
33. Simulation
The plot above shows results from 100 simulations. The black line shows the average time required for convergence with K = 11.
[Figure: Ratio of Major Colors versus Iteration Times; curves: mean, mean + std, and mean - std for K = 11]
34. Simulation
The plot above shows results from 100 simulations. The black line shows the average time required for convergence with K = 11.
35. Conclusions
01 Segmentation of overlapping umbrellas; multi-object vision tracking
02 K-means data clustering; data collection
03 Human swarm response to simple vocal commands
04 Human swarm "shape-matching" ability
05 Evaluation of position memory
06 200-node simulation
Editor's Notes
Hello everyone, and thank you for attending my Master’s thesis defense. Let’s begin the presentation.
My thesis research is titled “Metrics on crowd control with overhead video and vocal commands”, and it has been an interesting research project for me.
It is based on an interesting video; the image on the screen is a capture from that video.
Overall, this thesis presents an agent-tracking framework for semi-structured crowded video.
This framework is used to investigate how large numbers of people respond to vocal commands with local feedback and an overhead video.
The video shows an overhead view of more than 200 people, each holding an umbrella equipped with red, blue, and green LED lights.
A director on an elevated platform announces commands, and the overhead camera view is projected on a large screen. The participants were divided into several groups according to their major, gender, or grade.
Special thanks to Professor Daniela Rus of MIT; all the data I analyzed comes from the video she made.
So what did we get from this video, and what did we analyze?
We can see a representative screenshot, the results after we classified the umbrellas, and a plot showing umbrella color counts as a function of time.
In one frame all umbrellas may be the same color, but later they may be directed to change to another color or to form different shapes in various colors.
The umbrellas in the video are not moving aimlessly; they change colors rapidly and often overlap.
So the first challenge is to collect the data, that is, to segment individual umbrellas. The solution employed is to erode all components until they shrink to points.
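As a rough illustration of the erosion idea, here is a minimal sketch on a toy binary mask. The 3x3 structuring element, the fixed number of passes, and the names `erode`, `count_components`, and `toy_mask` are assumptions made for this example; they are not taken from the thesis code.

```python
def erode(mask):
    """One pass of binary erosion with a 3x3 square structuring element.

    A pixel stays 1 only if it and all eight neighbours are 1;
    border pixels are always cleared.
    """
    rows, cols = len(mask), len(mask[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            if all(mask[r + dr][c + dc] for dr in (-1, 0, 1) for dc in (-1, 0, 1)):
                out[r][c] = 1
    return out


def count_components(mask):
    """Count 4-connected groups of 1-pixels with an iterative flood fill."""
    rows, cols = len(mask), len(mask[0])
    seen = set()
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and (r, c) not in seen:
                count += 1
                stack = [(r, c)]
                seen.add((r, c))
                while stack:
                    y, x = stack.pop()
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and mask[ny][nx] and (ny, nx) not in seen:
                            seen.add((ny, nx))
                            stack.append((ny, nx))
    return count


# Two 5x5 "umbrellas" joined by a one-pixel bridge (hypothetical toy mask).
toy_mask = [[0] * 13 for _ in range(7)]
for r in range(1, 6):
    for c in list(range(1, 6)) + list(range(7, 12)):
        toy_mask[r][c] = 1
toy_mask[3][6] = 1  # the bridge that merges the two blobs

eroded_once = erode(toy_mask)      # the bridge disappears, blobs separate
eroded_twice = erode(eroded_once)  # each blob shrinks to a single point
```

In this toy case two passes are enough. On real frames the umbrellas differ in apparent size, so the number of passes would have to be chosen per component rather than fixed.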
Automatically tracking how the human swarm moves is part of our goal, and that is not easy either.
In particular, we then want to see more details of the human swarm’s performance: what the response time to a simple vocal command is, how accurately the swarm moves when directed, what its learning rate is, and how good its position memory is.
In addition, to make the thesis research more complete, we need to run some simulations.
The first step is to identify the umbrellas and record their positions. However, neither the number nor the positions of the umbrellas are constant: the number changes as umbrellas enter and leave the field of view or lose battery power.
Here we apply K-means clustering to refine the centroid of each object. Data clustering is frequently used in many fields, including data mining, pattern recognition, decision support, machine learning, and image segmentation.
The aim of the K-means algorithm is to obtain more accurate position data for the umbrellas.
Given a set of $n$ data points in real $d$-dimensional space, $\mathbb{R}^d$, and an integer $k$, the problem is to determine a set of $k$ points in $\mathbb{R}^d$, called centers, so as to minimize the mean squared distance from each data point to its nearest center.
Here $X = (x_1, \ldots, x_n)$ is the data matrix, $m_k = \frac{1}{n_k} \sum_{i \in C_k} x_i$ is the centroid of cluster $C_k$, and $n_k$ is the number of points in $C_k$.
Equation (1) is used to assign objects: each $x_p$ is assigned to exactly one $S_i^{(t)}$.
Equation (2) is used to calculate the new means, which become the new centroids of the observations in the new clusters.
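The assignment and update steps can be sketched in a few lines. This is a minimal illustration of the standard Lloyd iteration started from given seeds, not the thesis implementation; the function name `kmeans`, the `max_iters` cap, and the sample `pixels` are assumptions for the example.

```python
import math


def kmeans(points, seeds, max_iters=100):
    """Lloyd's algorithm: alternate the assignment and update steps.

    points: list of (x, y) tuples; seeds: initial centroids.
    Returns (centroids, labels).
    """
    centroids = [tuple(s) for s in seeds]
    labels = [0] * len(points)
    for _ in range(max_iters):
        # Assignment step: each point goes to exactly one nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its assigned points.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                n_k = len(members)
                new_centroids.append(
                    (sum(x for x, _ in members) / n_k, sum(y for _, y in members) / n_k)
                )
            else:
                new_centroids.append(old)  # an empty cluster keeps its old centroid
        if new_centroids == centroids:
            break  # centroids stopped moving, so assignments are stable
        centroids = new_centroids
    return centroids, labels


# Hypothetical pixel coordinates from two umbrellas, with rough seeds.
pixels = [(0, 0), (0, 2), (2, 0), (2, 2), (10, 10), (10, 12), (12, 10), (12, 12)]
centroids, labels = kmeans(pixels, seeds=[(1, 0), (9, 9)])
```

Passing the seeds explicitly mirrors the idea of reusing one frame's centroids as the starting point for the next frame.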
As the image shows, we have marked every umbrella’s center.
We analyzed every 15th frame of the raw video to get the data we want. In the first frame, each umbrella’s centroid was manually marked, and those centroids were used as seeds for the next frame.
In the next frame, K-means was used to refine the seeds’ positions to ensure they are in the middle of each umbrella. This gives us accurate position data.
Once we have accurate position data, we are able to track the umbrellas in a further step.
The centroids are overlaid on the raw video. The umbrellas’ position data was first collected and stored; it can then be used in a Kalman filter to track the motion paths.
This image shows the x and y positions as a function of time, with the lines drawn in the correct colors of the umbrellas; the plot clearly shows the changes in the positions and colors of multiple umbrellas.
It covers the period from the beginning of the video up to frame 800.
In this section, we use the Kalman filter algorithm to track the umbrellas. It is a recursive algorithm, so new measurements can be processed as they arrive, and then a new round of calculation begins.
We chose the Kalman filter because it filters out noise while finding the best estimate of the data; a Kalman filter does not just clean up the measurements, it also projects those measurements onto the state estimate.
A represents the state transition matrix
U is o
B represents the input matrix, which is optional
H represents the observation matrix
K represents the Kalman Gain
P represents the estimate covariance
Z represents the measurement results
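To make the roles of A, H, K, P, and Z concrete, here is a minimal sketch of one predict/update loop for a single coordinate. It assumes a constant-velocity model applied to each axis independently, omits the optional input term B, and uses made-up noise levels `q` and `r`; none of these choices are confirmed by the thesis.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def kalman_track(measurements, dt=1.0, q=1e-3, r=1.0):
    """Filter a sequence of 1D position measurements Z.

    State is [position, velocity]. q and r are assumed process and
    measurement noise variances. Returns a list of (position, velocity).
    """
    A = [[1.0, dt], [0.0, 1.0]]        # state transition matrix
    H = [[1.0, 0.0]]                   # observation matrix: position only
    Q = [[q, 0.0], [0.0, q]]           # process noise covariance (assumed)
    x = [[measurements[0]], [0.0]]     # initial state estimate
    P = [[1.0, 0.0], [0.0, 1.0]]       # initial estimate covariance
    estimates = []
    for z in measurements:
        # Predict: project the state and covariance forward one step.
        x = matmul(A, x)
        P = add(matmul(matmul(A, P), transpose(A)), Q)
        # Update: compute the Kalman gain K and correct with measurement z.
        S = matmul(matmul(H, P), transpose(H))[0][0] + r
        PHt = matmul(P, transpose(H))
        K = [[PHt[0][0] / S], [PHt[1][0] / S]]
        residual = z - matmul(H, x)[0][0]
        x = [[x[0][0] + K[0][0] * residual], [x[1][0] + K[1][0] * residual]]
        I_minus_KH = [[1.0 - K[0][0], 0.0], [-K[1][0], 1.0]]
        P = matmul(I_minus_KH, P)
        estimates.append((x[0][0], x[1][0]))
    return estimates


# Hypothetical noise-free track moving 2 pixels per step.
track = kalman_track([2.0 * t for t in range(50)])
final_position, final_velocity = track[-1]
```

Running one such filter for x and another for y is the simplest way to track a 2D centroid under these assumptions; a joint 4-state filter would be the natural next step.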
After that, we are able to identify each umbrella in the video and get accurate position data for the umbrellas.
We are able to track multiple umbrellas and know where they are going, but that is only the first few steps of a long journey. Next, I’d like to share the seven experiments I carried out during the thesis project.
For a series of color-change commands, the human swarm was asked to change colors eight times. This figure displays the time required for 80% of the swarm to achieve the desired value.
In trial No. 1, the human swarm took about 4.6 seconds to reach consensus; the next trial took about 3.5 seconds; and by the last trial the response time was less than 3 seconds.
This indicates that, for color-change vocal commands, the swarm’s response time tends to decrease, which suggests that the swarm is learning.
In the video, there are several vocal commands that required people to change their umbrellas’ color.
Commands included “I want everybody to turn them red”, and “Now turn the red off. Turn the blue on!” This test analyzes how people responded to eight color-change commands.
The vocal commands mentioned above can be seen as a series of basic vocal commands.
Eight vocal color-change commands were recorded. All data is aligned so that the command begins at t = 0 s. For example, after the vocal command “I want everybody to turn them red,” people began to switch their umbrellas’ color at t = 1 s, and at t = 7 s, 90% of the umbrellas had changed to red.
We define the response time as the time from the first response to the moment when more than 90% of the umbrellas have changed color.
In this case, the response time is 6 seconds (from t = 1 s to t = 7 s). During successive color-change commands, it takes less time for 90% of the swarm to turn their umbrellas to one color. This means the swarm’s performance is improving.
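The definition above can be written down directly. In this sketch the function name, the sampled `fractions` series, and the choice of "at or above the threshold" as the stopping rule are assumptions for illustration.

```python
def response_time(times, fractions, threshold=0.9, start_fraction=0.0):
    """Time from the first response to the first sample at or above threshold.

    times: sample times in seconds, aligned so the command starts at t = 0.
    fractions: fraction of umbrellas showing the commanded color at each time.
    Returns None if the swarm never responds or never reaches the threshold.
    """
    t_first = next((t for t, f in zip(times, fractions) if f > start_fraction), None)
    t_done = next((t for t, f in zip(times, fractions) if f >= threshold), None)
    if t_first is None or t_done is None:
        return None
    return t_done - t_first


# Hypothetical series shaped like the example: first change at 1 s, 90% at 7 s.
times = list(range(9))
fractions = [0.0, 0.1, 0.3, 0.5, 0.65, 0.78, 0.85, 0.9, 0.95]
elapsed = response_time(times, fractions)
```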
For simple color-change vocal commands, people were able to achieve the goal in a short time. This section analyzes the time response to more complex commands.
For example, at time 01:23, the vocal command was “When I say go I want you to turn them on and I want this whole group, this group that's gathered tonight to be one color but I'm not going to tell you what color that is.”
So we will see exactly how long it takes the human swarm to accomplish the vocal command they heard.
This experiment is a classic distributed consensus problem. Everyone in the crowd must adjust their own color to agree with their neighbors, but since the vocal command does not specify which color to turn to, the process takes about 10 seconds. For this analysis, we count the umbrellas of each color in every frame, which shows which color the human swarm is converging to. At the same time we get the ratio of the major color. An exponential function is fit to the data, giving $1 - e^{-0.32t}$.
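One simple way to obtain such a rate is to linearize the model and solve a one-parameter least-squares problem. The sketch below assumes the model form ratio = 1 - exp(-lam * t); the fitting method actually used in the thesis is not stated, so `fit_rate` is only an illustration.

```python
import math


def fit_rate(times, ratios):
    """Least-squares estimate of lam in ratio = 1 - exp(-lam * t).

    Uses the linearization ln(1 - ratio) = -lam * t, a line through the origin.
    Samples with ratio >= 1 are skipped because the logarithm is undefined there.
    """
    num = 0.0
    den = 0.0
    for t, y in zip(times, ratios):
        if y < 1.0:
            num += t * math.log(1.0 - y)
            den += t * t
    return -num / den


# Synthetic check: data generated with lam = 0.32 should give back 0.32.
sample_times = [float(t) for t in range(1, 11)]
sample_ratios = [1.0 - math.exp(-0.32 * t) for t in sample_times]
lam = fit_rate(sample_times, sample_ratios)
```

The log transform gives extra weight to samples near a ratio of 1, so a nonlinear fit on the raw ratios may be preferable for noisy data.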
Next, let’s compare the accuracy.
We compare the accuracy of three commands.
We define an umbrella as moving if, over a 20-frame interval, its position changes by more than a quarter of its radius.
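This rule is easy to express in code. The sketch below assumes positions sampled 20 frames apart and a common umbrella radius in pixels; the function names are made up for the example.

```python
import math


def is_moving(prev_pos, new_pos, radius):
    """True if the displacement between two samples exceeds a quarter radius."""
    return math.dist(prev_pos, new_pos) > radius / 4.0


def fraction_moving(prev_positions, new_positions, radius):
    """Fraction of umbrellas classified as moving between two samples."""
    flags = [is_moving(p, q, radius) for p, q in zip(prev_positions, new_positions)]
    return sum(flags) / len(flags)


# Hypothetical radius of 8 pixels gives a threshold of 2 pixels.
ratio = fraction_moving([(0, 0), (5, 5)], [(3, 0), (5, 5)], radius=8)
```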
This figure shows the difference between the velocity of the red umbrellas and that of the blue umbrellas. Before the vocal command “Go”, the red umbrellas are moving and the blue umbrellas are not; after the command, the reds freeze and the blues move instead.
It also shows that the swarm’s accuracy increased. We fit an exponential function to each of the three vocal commands: the ratio for “red move” is $0.92(1 - e^{-1.61t})$, and for “red freeze blue move” it is $0.95(1 - e^{-1.06t})$, which is the same as for “green go.”
For a series of similar commands, the human swarm responds faster and performs more accurately than before.
Compared with the color-change vocal commands, the human swarm was given harder commands later in the video, including forming circles.
There are three circles with different colors. We calculated the circularity of the human swarm when it was commanded to keep circling around; values closer to one mean the swarm shape is closer to a true circle.
At t = 485 seconds, the human swarm was given the vocal command and began to move. The swarm followed this command until t = 540 seconds, when a new command was announced. During this time, the circles became increasingly round.
In this short section I will show how the human swarm performs “snakes”.
The human swarm was told to form a “snake”, which means they were divided into three groups based on their colors and required to connect with their neighbors to move like a snake.
For this experiment, a snake is defined as follows: there must be at least five same-colored umbrellas connected one by one, where each successive umbrella is within six umbrella radii of a neighbor.
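A simple way to approximate this definition is to group same-colored umbrellas into connected components under the six-radius distance limit and keep the groups with at least five members. This sketch does not check that a group is an actual chain, and the names `find_snakes`, `min_size`, and `max_gap_radii` are assumptions for the example.

```python
import math


def find_snakes(umbrellas, radius, min_size=5, max_gap_radii=6):
    """Group same-colored umbrellas that are linked by short gaps.

    umbrellas: list of (x, y, color).
    Returns (snakes, loose): snakes is a list of sorted index lists,
    loose is the sorted list of indices not in any snake.
    """
    limit = max_gap_radii * radius
    unvisited = set(range(len(umbrellas)))
    snakes, loose = [], []
    while unvisited:
        start = unvisited.pop()
        group, frontier = [start], [start]
        while frontier:
            i = frontier.pop()
            xi, yi, ci = umbrellas[i]
            linked = [
                j for j in unvisited
                if umbrellas[j][2] == ci
                and math.dist((xi, yi), umbrellas[j][:2]) <= limit
            ]
            for j in linked:
                unvisited.remove(j)
            group.extend(linked)
            frontier.extend(linked)
        if len(group) >= min_size:
            snakes.append(sorted(group))
        else:
            loose.extend(group)
    return snakes, sorted(loose)


# Hypothetical layout: five linked reds, one blue, and two distant reds.
layout = [(0, 0, "red"), (5, 0, "red"), (10, 0, "red"), (15, 0, "red"),
          (20, 0, "red"), (2, 0, "blue"), (100, 0, "red"), (105, 0, "red")]
snakes, loose = find_snakes(layout, radius=1)
```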
At time t = 414 seconds, there are two green snakes with 25 and 35 people, one 19-member red snake, two blue snakes with 33 and 28 members, and 18 people who are not in any snake.
The human swarm began to form snakes when they heard the vocal command, at t = 385 seconds, and the command was completed at t = 435 seconds. During this period, the number of people in snakes increased while the number not in a snake decreased. At the end there are four snakes in the image.
The figure shows the number of umbrellas in the snakes as a function of time, along with how many umbrellas were not in a snake.
To display the result more directly, we made a “circle fit” for the bullseye.
The human swarm did form four circles according to color, and we drew the circle that best fits each of them.
As the picture shows, both the best-fit circles and the error lines are marked on the raw video capture; each error line represents the distance between an umbrella’s center and the boundary of the best-fit circle.
Based on those two plots, we can conclude that the shape-matching command is harder: the human swarm did not improve its performance in this period.
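For reference, a best-fit circle and its error lines can be computed as follows. This sketch uses an algebraic least-squares fit in centered coordinates, which is one common choice; the thesis does not say which fitting method was used, and the sample points are synthetic.

```python
import math


def fit_circle(points):
    """Algebraic least-squares circle fit; returns (cx, cy, r)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    suu = suv = svv = suuu = svvv = suvv = svuu = 0.0
    for x, y in points:
        u, v = x - mx, y - my  # coordinates relative to the mean point
        suu += u * u
        suv += u * v
        svv += v * v
        suuu += u * u * u
        svvv += v * v * v
        suvv += u * v * v
        svuu += v * u * u
    # Solve the 2x2 normal equations for the center (Cramer's rule).
    b1 = 0.5 * (suuu + suvv)
    b2 = 0.5 * (svvv + svuu)
    det = suu * svv - suv * suv
    uc = (b1 * svv - b2 * suv) / det
    vc = (suu * b2 - suv * b1) / det
    r = math.sqrt(uc * uc + vc * vc + (suu + svv) / n)
    return mx + uc, my + vc, r


def radial_errors(points, circle):
    """Length of each error line: distance from a point to the circle boundary."""
    cx, cy, r = circle
    return [abs(math.dist((x, y), (cx, cy)) - r) for x, y in points]


# Synthetic umbrella centers lying exactly on a circle of radius 5 at (3, -2).
ring = [(3 + 5 * math.cos(a), -2 + 5 * math.sin(a)) for a in (0.1, 0.9, 2.0, 3.5, 5.0)]
circle = fit_circle(ring)
errors = radial_errors(ring, circle)
```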
In the overhead video, at t = 613 s the human swarm was directed to form a “Bullseye”. To evaluate their performance, circularity was again evaluated using $C^2 / (4\pi A)$, with $C$ the perimeter and $A$ the enclosed area, which equals 1 for a perfect circle.
As we can see from the picture, the outer blue and green circles are relatively more stable and closer to a true circle, with circularity around 1.1 to 1.2.
The yellow and red groups did not perform as well as blue and green: their circularity is far from 1 and is not stable either.
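As a small worked example of this measure, the sketch below computes perimeter squared over 4 pi times area for a closed polygon. Treating the swarm outline as an ordered list of vertices is an assumption for illustration; how the perimeter and area were extracted from the video is not described here.

```python
import math


def circularity(polygon):
    """C^2 / (4 * pi * A) for a closed polygon given as ordered (x, y) vertices."""
    perimeter = 0.0
    twice_area = 0.0
    for (x1, y1), (x2, y2) in zip(polygon, polygon[1:] + polygon[:1]):
        perimeter += math.dist((x1, y1), (x2, y2))
        twice_area += x1 * y2 - x2 * y1  # shoelace term
    area = abs(twice_area) / 2.0
    return perimeter ** 2 / (4.0 * math.pi * area)


# A unit square scores 4 / pi; a many-sided regular polygon approaches 1.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
ngon = [(math.cos(2 * math.pi * k / 360), math.sin(2 * math.pi * k / 360))
        for k in range(360)]
square_score = circularity(square)
ngon_score = circularity(ngon)
```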
We have more than one standard for judging the human swarm’s performance. Here we calculate the mean distance between the centers of every two circles. For a good bullseye this distance should be small; in a perfect bullseye, all circles are concentric and the distance is zero.
We used the equation $\sqrt{(r_x - g_x)^2 + (r_y - g_y)^2}$ to calculate the distance between centers, where, for example, $(r_x, r_y)$ is the x, y position of the red circle’s center and $(g_x, g_y)$ is the x, y position of the green circle’s center.
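Averaged over every pair of circles, the distance above can be computed like this. The dictionary of centers is made-up sample data, and the function name is an assumption.

```python
import math
from itertools import combinations


def mean_center_distance(centers):
    """Mean Euclidean distance over all pairs of circle centers.

    centers: dict mapping a color name to its (x, y) circle center.
    """
    pairs = list(combinations(centers.values(), 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


# Hypothetical centers: the pair distances are 5, 0, and 5.
offset = mean_center_distance({"red": (0, 0), "green": (3, 4), "blue": (0, 0)})
# A perfect bullseye has concentric circles, so the mean distance is zero.
perfect = mean_center_distance({"red": (1, 1), "green": (1, 1), "blue": (1, 1)})
```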
The image shows the distance between each pair of circles as a function of time. From the initial state until the command finished, the distance between centers did not decrease much.
This experiment analyzes the accuracy of the human swarm when it was commanded to return to a known position.
In other words, we want to see how good the human swarm’s memory is.
How do we evaluate memory?
We compare the mean distance between everyone’s “original position” and their “returned position”. Smaller distances indicate better memory.
After the vocal command “When I say go I want you to move,” the human swarm spread out slowly from t = 539 s, and at t = 590 s they were commanded to return to their original positions. During this period, the distance between the current position and the original position increases, as expected.
After that moment, the human swarm began to go back, so the distance decreases. By t = 605 s, before the next command, the distance is almost the same as at the start.
So, according to our analysis, the human swarm has good position memory.
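The memory metric reduces to a mean point-to-point distance. The sketch below assumes the two position lists are in the same umbrella order, which in practice depends on the tracker keeping identities consistent; the sample coordinates are invented.

```python
import math


def mean_displacement(original, current):
    """Mean distance between each umbrella's original and current position."""
    return sum(math.dist(p, q) for p, q in zip(original, current)) / len(original)


# Hypothetical positions for two umbrellas at three moments.
start = [(0, 0), (10, 0)]
spread_out = [(3, 4), (10, 5)]
returned = [(0, 1), (10, 0)]
far = mean_displacement(start, spread_out)
near = mean_displacement(start, returned)
```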
200 agents are randomly placed on a 2D region, and each agent initially selects a color from red, green, or blue. We want all these agents to eventually agree on the same color, but it does not matter which color it is. At each turn, agents check the current colors of their K nearest neighbors.
We need all agents to select the same color. During the process, the agents update their colors synchronously at every turn.
No deterministic algorithm exists, so in our algorithm each node selects its next color from a probability distribution weighted according to the colors of its K nearest neighbors.
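A compact sketch of such a simulation is given below. It assumes that "weighted according to the colors of its K nearest neighbors" means each color is chosen with probability equal to its share among those neighbors, which is the same as copying the color of one neighbor picked at random. That reading, the exclusion of the agent itself from its neighbor list, the unit-square region, and the iteration cap are all assumptions.

```python
import math
import random
from collections import Counter

COLORS = ("red", "green", "blue")


def simulate(n=200, k=11, max_iters=500, seed=0):
    """Run one consensus simulation; return the major-color ratio per iteration."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(n)]
    colors = [rng.choice(COLORS) for _ in range(n)]
    # Positions are fixed, so each agent's k nearest neighbors are found once.
    neighbors = []
    for i, p in enumerate(points):
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: math.dist(p, points[j]))
        neighbors.append(order[:k])
    history = [Counter(colors).most_common(1)[0][1] / n]
    for _ in range(max_iters):
        if history[-1] == 1.0:
            break  # every agent shows the same color
        # Synchronous update: all agents sample from the previous colors.
        colors = [colors[rng.choice(nbrs)] for nbrs in neighbors]
        history.append(Counter(colors).most_common(1)[0][1] / n)
    return history


# A small hypothetical run; the defaults match the 200 agents and K = 11.
history = simulate(n=30, k=5, max_iters=50, seed=1)
```

Averaging the returned histories over many seeds would give a mean curve with a standard-deviation band.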
We ran the simulation 100 times to get a plot showing the ratio of the major color as a function of time. The ratio of the major color increases steadily.
The experiment is an exploration of the power of groups and the idea that groups are more capable than the sum of their parts.
The Umbrella Project allows the exploration of theories on collaboration in the context of crowds and enables the extraction of hypotheses for future biologically-grounded approaches to robot control.