Technical Anatomy of a Caller ID Android App

At the 2013 Grace Hopper Conference in Minneapolis, MN, one of our mobile engineers, Kristine Delossantos, gave a lightning talk about the technology behind Current Caller ID. @WhitePages

2013
Technical Anatomy of a Caller ID application
Kristine Delossantos
Oct 3rd, 2013
#GHC13
1:17 PM

Outline
• Overview of WhitePages Current Caller ID
• Technical Architecture
• Key Problems and Solutions

Sweet Call Alert
Know who’s calling and what is happening with them in real time

One List, One Touch
• Consolidated call/text log
• One-tap easy access to top contacts

Make It Visual
• Sharable insights into communication style: who, when, and how.

How It Works
Meet Spongebob…
He just got a new phone,
Installed Current,
And wired it to Facebook!

Spongebob’s Friends Are Excited!
They want to celebrate and get Krabby Patties together,
So they text him about it

Technical Architecture
[Diagram labels:]
• Active MQ
• Contact Graph Store
• WhitePages Mobile Service Front Ends
• Entity Resolution System
• Data Collection Services

Keeping Data Fresh
Network variance
Data connections
Usage Plans
Push/Pull protocols
Our solution:
• We periodically update the data on a schedule in the background, in batch.
• Active MQ & worker machines

Data Transfer
Our solution:
Thrift over HTTP, and we only deliver objects since the last successful request.
[Chart: Serialized Contact List Size Comparison, Thrift vs. JSON]
[Diagram labels: GZip, Thrift, HTTP, Updates]

Storage Solution
Engineering costs
Operational costs
Postgres
Our solution:
We settled on Postgres and treat it as a NoSQL key-value store. This saved engineering time as well as costs.
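A minimal sketch of what "Postgres as a key-value store" might look like, combined with the partitioning that the speaker's notes for this slide mention. It is an illustration under stated assumptions, not the WhitePages schema: `sqlite3` in-memory databases stand in for separate Postgres databases so the example runs without a server, and the table layout, the partition count, and the `partition_for`, `put`, and `get` helpers are all hypothetical.

```python
import hashlib
import json
import sqlite3

NUM_PARTITIONS = 4  # assumed partition count, for illustration only


def partition_for(key, num_partitions=NUM_PARTITIONS):
    """Map a key to a partition with a stable hash (not Python's salted hash())."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions


# sqlite3 in-memory databases stand in for separate Postgres databases.
partitions = [sqlite3.connect(":memory:") for _ in range(NUM_PARTITIONS)]
for db in partitions:
    db.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT NOT NULL)")


def put(key, value):
    """Store a JSON document under a key, replacing any previous value."""
    db = partitions[partition_for(key)]
    db.execute(
        "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)",
        (key, json.dumps(value)),
    )


def get(key):
    """Fetch the JSON document stored under a key, or None if it is absent."""
    db = partitions[partition_for(key)]
    row = db.execute("SELECT value FROM kv WHERE key = ?", (key,)).fetchone()
    return json.loads(row[0]) if row else None


put("contact:555-0100", {"name": "Patrick Star"})
print(get("contact:555-0100"))

# Changing the partition count remaps many keys, which is one reason
# adding partitions to a scheme like this is expensive.
keys = [f"contact:{i}" for i in range(1000)]
moved = sum(partition_for(k, 4) != partition_for(k, 5) for k in keys)
print(moved)
```

The last few lines count how many of 1,000 sample keys would land on a different partition if the count went from 4 to 5. With plain modulo hashing most keys move, which is consistent with the notes' remark that adding partitions carried high engineering costs.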

Entity Resolution System
Machine learning
Tunable
Performance

Developing Mobile Applications
Carrier variance
Test matrix
Device variance
Platform solutions

Got Feedback?
Rate and review the session using the GHC Mobile App.
To download, visit www.gracehopper.org

Editor's Notes

My name is Kristine Delossantos and I am a Software Engineer on the Mobile Team at Whitepages. I wanted to take this time to talk about the technical workings of an Android application we released last August called Current Caller ID and leave you with some key takeaways from our development experience.
I’ll start off with a quick overview of our app. Then I’ll show an architectural diagram. Afterwards I’ll get to the key problems and our current solutions. First: what is Current Caller ID?
We have a sweet call alert. Not only will it tell you who’s calling, but it’ll also integrate Facebook, LinkedIn, and Twitter data to show what is happening with them in real time.
In the app, you can access a consolidated call/text log with one-tap access to your top contacts.
Then, we make it visual. We have sharable infographics that show insights into your communication style. They show you who you communicate with, when you communicate, and how. Now I’ll give you more detail about the technical side of things by walking through this scenario…
Meet Spongebob. He just got a new cell phone, installed Current, and wired it up to his Facebook account. He posted a status to Facebook with his new number, telling his friends to text him.
Patrick, Squidward, Mr. Krabs, and Sandy are so excited that Spongebob finally has a phone that they all text him to get Krabby Patties to celebrate. Current recognizes these as new contacts and gets to work.
Current sends the data to our servers and we store it for further processing. Our front ends deliver a message to an asynchronous message queue system, alerting the data collection services of the new contacts. Our data collection services pick that up and reach out to our WhitePages data and social networks to collect more information about the contacts. Then we store it. Our data collection services deliver another message to our ActiveMQ pipeline, alerting the entity resolution system that we’ve collected information that needs to be resolved together. The entity resolution system picks that up and fetches data from our contact graph store. (I’ll get into more detail about the entity resolution system in a bit.) It resolves the data, stores it, and sends it back to the client. Now I’ll dive into the key takeaways we learned while trying to make all of this work.
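The message flow described in this note can be sketched roughly as follows. This is not WhitePages code: Python's `queue.Queue` stands in for ActiveMQ, plain dictionaries stand in for the stores, the workers run in a single process, and every function and field name (`front_end_receive`, `fake_lookup`, and so on) is hypothetical.

```python
import queue

# In-process stand-ins (assumptions): queue.Queue plays the role of ActiveMQ,
# and plain dicts play the role of the contact graph store.
collect_queue = queue.Queue()   # front ends -> data collection services
resolve_queue = queue.Queue()   # data collection services -> entity resolution
contact_store = {}              # raw and enriched contact records, keyed by phone
resolved_store = {}             # resolved entities, keyed by phone


def front_end_receive(phone):
    """Store the new contact and notify the data collection services."""
    contact_store[phone] = {"phone": phone, "sources": []}
    collect_queue.put(phone)


def data_collection_worker(lookup):
    """Enrich each queued contact, store it, then notify entity resolution."""
    while not collect_queue.empty():
        phone = collect_queue.get()
        contact_store[phone]["sources"] = lookup(phone)
        resolve_queue.put(phone)


def entity_resolution_worker():
    """Merge the collected records for each queued contact into one entity."""
    while not resolve_queue.empty():
        phone = resolve_queue.get()
        merged = {"phone": phone}
        for record in contact_store[phone]["sources"]:
            merged.update(record)
        resolved_store[phone] = merged


def fake_lookup(phone):
    # Hypothetical directory and social-network results for one number.
    return [{"name": "Patrick Star"}, {"facebook": "patrick.star"}]


front_end_receive("555-0100")
data_collection_worker(fake_lookup)
entity_resolution_worker()
print(resolved_store["555-0100"])
```

Each stage only reads from its own queue and writes to a store, so in a real deployment the workers could run on separate machines and be scaled independently, which is the property the talk attributes to the queue-plus-workers design.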
When dealing with large data sets, you want to keep the data fresh, and do it efficiently. You don’t want to violate your customers’ trust by using up their data plans. We first needed to decide between a push and a pull protocol. Since the client triggers updates from the server, we didn’t need real-time updates every step of the way. Whenever your call/text log or address book changes, the client sends the changes to the server, then increases its polling frequency to fetch updates to any new associations created on the server, and then refreshes the UI. Doing real-time lookups is not fast enough to present a rich call alert in a timely fashion. Additionally, when we first started, CDMA was more prevalent, so simultaneous voice and data communication wasn’t possible. So we chose to pull, to minimize our customers’ data usage while still responding to updates quickly. The system was designed to run these jobs on its own, using ActiveMQ, a popular open source message queue system, and a scalable pool of worker machines that process messages delivered to the queue and update our databases. The key takeaway here is that it was best to deliver data asynchronously, as it becomes available, and to deliver only new data, so the user experience doesn’t suffer from long wait times and loading screens.
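One way to picture the pull strategy in this note, where the client polls more often right after it uploads changes, is a small scheduler like the one below. It is a sketch only: the class name, the method names, and all three interval values are assumptions, not details from the talk.

```python
class PollScheduler:
    """Chooses the delay before the next pull from the server."""

    def __init__(self, idle_interval=3600, fast_interval=30, fast_polls=5):
        # All three values are illustrative assumptions, not figures from the talk.
        self.idle_interval = idle_interval  # seconds between background polls
        self.fast_interval = fast_interval  # seconds between polls after an upload
        self.fast_polls = fast_polls        # how many fast polls to attempt
        self._fast_remaining = 0

    def on_local_change_uploaded(self):
        """Call after call/text log or address book changes are sent to the server."""
        self._fast_remaining = self.fast_polls

    def on_poll_result(self, got_updates):
        """Call after each poll; stop fast polling once the updates arrive."""
        if got_updates:
            self._fast_remaining = 0
        elif self._fast_remaining > 0:
            self._fast_remaining -= 1

    def next_delay(self):
        """Return the number of seconds to wait before the next poll."""
        if self._fast_remaining > 0:
            return self.fast_interval
        return self.idle_interval


scheduler = PollScheduler()
print(scheduler.next_delay())               # idle interval
scheduler.on_local_change_uploaded()
print(scheduler.next_delay())               # fast interval
scheduler.on_poll_result(got_updates=True)
print(scheduler.next_delay())               # back to the idle interval
```

Capping the number of fast polls keeps data usage bounded even if the server never produces new associations for a given upload.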
When transferring large sets of data, you want to pay close attention to using compact serialization schemes. Keep in mind that the mobile device may not always be connected, and make sure your app can handle that. When choosing our transfer protocol, we realized that HTTP was easiest to plug into our infrastructure. Then we compared Thrift and JSON for the format of our data. JSON can be compressed and is easy to debug, but ideally we wanted to keep payloads as small as possible, and Thrift was best for the job in its compact binary form. Compact binary Thrift, compared to JSON with the same data set, cut payloads roughly in half. We used Gzip because HTTP supports Gzip compression, which made it a widely available compression scheme, and it gave us an average 30% savings on the Thrift payloads. We also make sure we deliver only data that has changed, in batches, so the client receives only the data it needs. Be cognizant of payload size from a serialization format perspective, a compression perspective, and an overall structural perspective (the choice of delivering only deltas). When you’re dealing with large sets of data, you will probably need to store that data somewhere.
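Returning to the data transfer points in this note, the sketch below shows the two ideas that need no third-party library: delivering only records changed since the last successful sync, and gzip-compressing the batch. JSON is swapped in for compact binary Thrift so the example stays self-contained, which means it does not reproduce the Thrift size savings the talk reports. The per-record `version` counter and the function names are assumptions.

```python
import gzip
import json


def changes_since(contacts, last_sync_version):
    """Return only the records modified after the client's last successful sync."""
    return [c for c in contacts if c["version"] > last_sync_version]


def encode_payload(records):
    """Serialize and gzip a batch; JSON stands in for compact binary Thrift here."""
    raw = json.dumps(records, separators=(",", ":")).encode("utf-8")
    return gzip.compress(raw)


def decode_payload(blob):
    """Reverse encode_payload: gunzip, then parse the JSON batch."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))


# Hypothetical server-side contact list with a per-record version counter.
contacts = [
    {"id": i, "name": f"Contact {i}", "version": i} for i in range(1, 201)
]

full = encode_payload(contacts)
delta = encode_payload(changes_since(contacts, last_sync_version=195))

print(len(decode_payload(delta)))   # number of records in the delta batch
print(len(full), len(delta))        # compressed sizes in bytes
```

The client would remember the highest version it has applied and send it with the next request, so a failed or interrupted transfer simply results in the same delta being requested again.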
It is important for your storage solution to be fault tolerant, maintain consistency, and scale horizontally. You might want to consider data partitioning to increase I/O throughput and maintain scalability. We use Postgres and treat it as a NoSQL key-value store. We use partitions to spread our data across multiple databases. A drawback of our solution is that it’s difficult to add more partitions without high engineering costs. We are currently exploring tools that can scale automatically so that adding capacity is a simpler task. One thing that helps this effort is that, early in our development, we deliberately separated our API and model code from the underlying storage. It’s important to choose a data model that is efficient and flexible, and a storage engine that can easily adapt to unexpected events. Make sure it meets the customer requirements, I/O requirements, and processing requirements. Keep in mind operational requirements and growth. Make sure it still works under a 20x growth projection. If you’re developing an application that’s data centric, you might need to detect separate records that refer to the same entities.
… which means you’ll need an entity resolution system. The InfoLab at Stanford University defines entity resolution as “locating and merging records that refer to the same real-world entities”. In our case, we needed one to match names. If your application requires an entity resolution system, you want to make sure it is tunable and performs well. The obvious choice for developing large-scale entity resolution systems is machine learning. We originally opted not to use machine learning because we had a predefined set of rules we thought were correct. As we tried to implement our system, we learned that it wasn’t as simple as we thought. You might want to consider machine learning up front; in hindsight, we could have explored it more. The first step in building our entity resolution system was defining the rules that would resolve two entities together. For example, to resolve two contacts together, they have to have a last name match from two different sources, while the first name can be a nickname or a complete match. We started with a decision tree to support the rules we had outlined. One drawback of the decision tree is that it scales very well vertically, if you add more rules, but doesn’t scale well horizontally: in our example, adding a few more social networks to match the contacts against wouldn’t be easy. We wrote tools to run samples of user data through the engine and check whether we got the expected results based on the defined rules. This significantly sped up iteration on our match rate and the resolution engine.
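The example rule in this note (last names match across two different sources, first names match exactly or through a nickname) can be written as a small predicate. This is a sketch of that single rule only, not the decision tree the team built; the nickname table, the record field names, and the function names are all assumptions made for illustration.

```python
# Hypothetical nickname table; a real system would need a far larger one.
NICKNAMES = {
    "bob": "robert",
    "bill": "william",
    "liz": "elizabeth",
}


def canonical_first_name(name):
    """Lower-case a first name and expand a known nickname to its full form."""
    name = name.strip().lower()
    return NICKNAMES.get(name, name)


def same_entity(a, b):
    """Apply the example rule from the talk to two contact records.

    Each record is a dict with 'first', 'last', and 'source' keys
    (field names are assumptions made for this sketch).
    """
    if a["source"] == b["source"]:
        return False  # the match must come from two different sources
    if a["last"].strip().lower() != b["last"].strip().lower():
        return False  # last names must match
    # First names may match exactly or through a nickname.
    return canonical_first_name(a["first"]) == canonical_first_name(b["first"])


address_book = {"first": "Bob", "last": "Smith", "source": "address_book"}
facebook = {"first": "Robert", "last": "Smith", "source": "facebook"}
linkedin = {"first": "Alice", "last": "Smith", "source": "linkedin"}

print(same_entity(address_book, facebook))
print(same_entity(address_book, linkedin))
```

Keeping each rule as a pure function over two records also makes it straightforward to replay sample data through the rules and compare against expected matches, which is the kind of tooling the note describes.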
I’d like to close out our talk with what to keep in mind when developing mobile applications. The mobile team at WhitePages has developed several applications in the past, but this one was particularly interesting and we came out of it with several takeaways. When you are conceptualizing an idea, the first step is to evaluate the feasibility of the product by exploring various platforms and the capabilities available to you. For instance, in our case iOS doesn’t provide access to call history or any kind of call/text communication data. The platform we targeted for Current was Android, as it gives us the most access to enable caller ID functionality. We also noticed during development that, since we were working with private APIs, there was a lot of variation between implementations on different carriers and manufacturers. For example: 1) Annotation of call type and notifications of incoming/outgoing calls differ among devices and carriers. 2) Current allows blocking calls and texts, and to block calls on HTC we had to set additional state so the device would respond to pickup and hangup API calls. To avoid surprises, we’d highly recommend defining your device matrix for testing well ahead of time. Note that it can be very different from the top devices published for the platform, depending on the demographics you are targeting and the nature of your product, so do your research early.
