Talk 1: Google App Engine Development: Java, Data Models, and other things you should know (Navin Kumar, CTO of
Description: Google AppEngine is a cloud architecture designed to run and scale your own web applications. It makes applications easy to develop and, with the introduction of Java language support, lets you use the standard servlet development model; together with GWT, this allows end-to-end Java development of powerful Ajax-based web applications. Here we describe the tips and tricks used to develop Socialwok, a rich, on-demand enterprise microblogging platform deployed on Google AppEngine using the Java language support. Finally, we introduce a neat example that illustrates some of the tricks you can use on AppEngine to develop your own applications.



  • Thanks for this presentation!
    I was looking everywhere for a java example for Relation Index.

    On slide 24, you show how to get all msgs where userid is a recipient.
    Assuming a msg has fields I want to filter by (e.g. Date, Topic)
    How would you query msgs where userid is a recipient and also filter by date and topic?

    The only solution I can think of is duplicating these Date and Topic fields to the MessageRecipients class.

    Does this make any sense? would it reduce the efficiency of the keys-only query?

  • An update to an entity occurs in a transaction that is retried a fixed number of times if other processes are trying to update the same entity simultaneously. Your application can execute multiple datastore operations in a single transaction, which either all succeed or all fail, ensuring the integrity of your data.
  • Before this slide, switch to application.

Presentation Transcript

  • Google App Engine Development Java, Data Models, and Other Things You Should Know Navin Kumar Socialwok
  • Introduction to Google App Engine
      • Google App Engine is an on-demand cloud platform that can be used to rapidly develop and scale web applications.
      • Advantages:
        • You are using the same architecture and tools that Google uses to scale their own applications. 
        • Easy to develop your own applications using Java and Python
        • Free Quotas to get you started immediately.
  • Java Support on Google App Engine
      • Java support was introduced in April 2009
      • Remarkable milestone for several reasons:
        • Brought the Java Servlet development model to Google App Engine
        • You can use your favorite Java IDE to develop your applications now (Eclipse, NetBeans, IntelliJ)
        • Database development is easy with JDO and JPA
        • Not limited to the Java language: any JVM-supported language can be used (JRuby, Groovy, Scala, even JavaScript (Rhino), PHP, etc.)
  • Eclipse Support and GWT
      • Eclipse is the premier open source Java IDE, and with the Google Plugin for Eclipse, developing Google AppEngine apps can be done very easily.
      • Eclipse will automatically lay out your web application for you, in addition to providing 1-click deployment.
      • GWT is also supported by the Eclipse plugin, and can also be used along with your Google AppEngine codebase.
        • End-to-end Java development of powerful Java-based web applications.
  • Google Plugin for Eclipse (GWT and AppEngine)
  • BigTable: Behind Google's Datastore
      • BigTable: A Distributed Storage System for Structured Data
        • Built on top of GFS (Google File System)
      • Strongly consistent and uses optimistic concurrency control
      •   But it's not a relational database 
        • No Joins or true OR queries
        • "!=" is not implemented
        • Limitations on the use of "<" and ">"
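The optimistic concurrency control mentioned on this slide can be illustrated in plain Java. This is a minimal sketch and not the datastore's implementation: `VersionedCell`, `Snapshot` and the `MAX_RETRIES` value are hypothetical, and an `AtomicReference` stands in for a stored entity. A writer reads a snapshot, computes the new value, and commits only if nobody else has committed in the meantime; otherwise it retries a fixed number of times.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for a stored entity; not an App Engine class.
final class VersionedCell<T> {
    // Immutable snapshot: a value plus the version it was written at.
    private static final class Snapshot<T> {
        final T value;
        final long version;
        Snapshot(T value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    private static final int MAX_RETRIES = 5; // assumed retry limit

    private final AtomicReference<Snapshot<T>> current;

    VersionedCell(T initial) {
        current = new AtomicReference<>(new Snapshot<>(initial, 0L));
    }

    T get() { return current.get().value; }

    long version() { return current.get().version; }

    // Read, compute, then commit only if no other writer committed in between.
    boolean update(UnaryOperator<T> change) {
        for (int attempt = 0; attempt < MAX_RETRIES; attempt++) {
            Snapshot<T> seen = current.get();
            Snapshot<T> next = new Snapshot<>(change.apply(seen.value), seen.version + 1);
            if (current.compareAndSet(seen, next)) {
                return true; // commit succeeded
            }
            // Another writer won the race; loop and retry on fresh data.
        }
        return false; // give up after a fixed number of attempts
    }
}
```

No lock is held while the new value is computed, which is why contention shows up as retries rather than as blocking.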
  • Data Models
      • DataNucleus is used to handle the Java persistence frameworks on AppEngine
      • 2 Choices: JDO (Java Data Objects) or JPA (Java Persistence API) (JPA will be very familiar to those who have used Hibernate or EJB persistence frameworks)
      • Both involve very similar coding styles.
      • For this talk, we will focus on JDO, but JPA is very similar, so the same concepts can be applied.
      • There is also a low-level datastore API that we will touch on as well
  • Defining Your Data Model
    • package;
    • import java.io.Serializable;
    • import javax.jdo.annotations.*;
    • import com.google.appengine.api.datastore.Text;
    • @PersistenceCapable(identityType = IdentityType.APPLICATION)
    • public class Post implements Serializable {
    •     private static final long serialVersionUID = 1L;
    •     @PrimaryKey
    •     @Persistent(valueStrategy=IdGeneratorStrategy.IDENTITY)
    •     @Extension(vendorName="datanucleus", key="gae.encoded-pk", value="true")
    •     private String id;
    •     public String getId() { return id; }
    •     @Persistent
    •     private String title;
    •     public String getTitle() { .. }
    •     public void setTitle(String title) { .. }
    •     @Persistent
    •     private Text content;
    •      public String getContent() { .. }
    •      public void setContent(String content) { .. }
    •     ..
    • }
  • Creating, Deleting, and Querying
      • At the heart of everything is the PersistenceManager
    •       PersistenceManager pm = PMF.get().getPersistenceManager();
    •       Post post = new Post();
    •       post.setTitle("Title");
    •       post.setContent("Google AppEngine for Java");
    •       try {
    •          pm.makePersistent(post);
    •       } finally {
    •          pm.close();   // always release the PersistenceManager
    •       }
    •       ...
    •       // pm below is a freshly obtained, open PersistenceManager
    •       try {
    •          Post deleteMe = pm.getObjectById(Post.class, deleteId);
    •          pm.deletePersistent(deleteMe);
    •       } finally {
    •          pm.close();
    •       }
    •       ...
      • Build queries using JDOQL
    •       Query query = pm.newQuery(Post.class);
    •       query.setFilter("title == titleParam");
    •       query.declareParameters("String titleParam");
    •       query.setUnique(true);
    •       Post post = (Post) query.execute("Title");
  • Relationships
      • Owned one-to-one and one-to-many
      • @Persistent(mappedBy="field") annotation syntax.
      • Unowned relationships (one-to-one, one-to-many, many-to-many)
      • @Persistent Key otherEntity ;
      • @Persistent List<Key> otherEntities;  
      • Owned relationships create a parent-child relationship
        • Parent and child entities are stored in the same entity group
        • Entity group defines a location in the datastore
        • This is important because Transactions on the datastore can only be applied over a single entity group
  • Other APIs you should be aware of
      • UsersService
        • Don't write a login, use Google's!
      • ImagesService
        • Picasa image manipulation web services
      • Memcache
        • Distributed cache for objects
        • Very useful! More on this later...
      • URL Fetch
      • Mail service
        • Send outbound emails w/ some restrictions
      • APIs (except UsersService) subject to quota limitations
  • And now for the fun stuff...  
      • Enterprise social collaboration application built on Google App Engine. 
        • Utilizes a social concept of feeds (also referred to as presence and activity streams)
        • Combines the querying of reasonably complex data with the privacy requirements of social networking.
      • Uses tons of Google App Engine APIs, Google APIs, and GWT.
      •   While building it, we have learned several things about Google App Engine that have allowed us to make the app reasonably fast and responsive.
  • Lesson 1: Utilization of Memcache
      • Data structure of each feed is relatively complex
        • At least 3 explicit unowned relationships
    •       @Persistent Key user;
    •       @Persistent Key network;
    •       @Persistent List<Key> attachments;
          • Requires querying for each of these objects explicitly when representing them in the feed.
      • Feed is fetched repeatedly by many (hundreds of) concurrent users
        • The feed display needs to be reasonably responsive for all of these users
  • Lesson 1 (cont.) Solution: Memcache
      • Distributed in-memory cache 
        • Uses javax.cache.* APIs
        • Also, a low-level API:*
      • Basic uses:
        • Speed up existing common datastore queries
        • Session data, user preferences
      • Cache data is retained as long as possible if no expiration is set
      • Data is not stored on any persistent storage, so you must be sure your app can handle a "cache miss"
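Handling a cache miss usually means falling back to the datastore and repopulating the cache. The sketch below shows that pattern in plain Java under stated assumptions: a `ConcurrentHashMap` stands in for Memcache, the loader function stands in for a datastore query, and `FeedCache`, `getFeed` and `evict` are hypothetical names rather than App Engine APIs.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Cache-aside sketch. The map stands in for Memcache and the loader for a
// datastore query; both are assumptions made for illustration.
final class FeedCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> datastoreLoader;
    private int datastoreReads = 0; // counter for the demo only, not thread-safe

    FeedCache(Function<String, String> datastoreLoader) {
        this.datastoreLoader = datastoreLoader;
    }

    String getFeed(String feedKey) {
        String cached = cache.get(feedKey);
        if (cached != null) {
            return cached; // cache hit: no datastore work
        }
        // Cache miss: fall back to the datastore, then repopulate the cache.
        datastoreReads++;
        String fresh = datastoreLoader.apply(feedKey);
        if (fresh != null) {
            cache.put(feedKey, fresh);
        }
        return fresh;
    }

    // Simulates the cache dropping an entry, which can happen at any time.
    void evict(String feedKey) { cache.remove(feedKey); }

    int datastoreReads() { return datastoreReads; }
}
```

Because the caller never assumes an entry is present, an eviction only costs one extra datastore read.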
  • Lesson 1: Memcache conclusions
      •   Works really well!
        • Responsive requests
        • Response time dropped from 2 s to ~800 ms (a 60% decrease)
      • Cache data is generally retained for a very long time
      • Distributed nature of cache provides benefits to every user on the system.
        • The more people who use your app, the better your app performs**
      • Even free quota for Memcache is quite generous:
        • ~ 8.6 million API calls.
  • Lesson 2: Message Delivery Fanout
      • Adapted from Building Scalable, Complex Apps... from Google I/O by Brett Slatkin
      •   Basically deals with a problem of fan-out
        • Socialwok has a concept of "following" (which is basically a subscription between users)
        • In our case, one user posts a single message that needs to be "delivered" to all his subscribers
        • How do we show the message efficiently to all his subscribers?
          • We can deliver the message by reference to its recipients.
  • Lesson 2 (cont.): RDBMS version (2 primary tables, 2 join tables)
      • To get Messages to display for the current user
      • SELECT * from Messages INNER JOIN UserMessages USING (message_id) WHERE UserMessages.user_id = 'current_user_id'
      • But there aren't any joins on AppEngine!
    Primary table:
      User ID | Name
      1       | Navin
      2       | John
      3       | Vikram
    Primary table:
      Message ID | Message         | User ID
      1          | Hello world     | 1
      2          | Another message | 3
    Join table:
      Follower ID | Following ID
      1           | 2
      1           | 3
      2           | 1
    Join table:
      Recipient ID | Message ID
      1            | 34
      1            | 67
  • Lesson 2: List Properties to the Rescue
      • A list property is a property in the datastore that has multiple values:
      • @Persistent private Collection<String> values;
        • Represented in Java using Collection fields (Set, List, etc.)
        • Indexed in the same way that normal fields are
        • Densely pack information
        • Query like you query any single-valued property:
          • query.setFilter("values == 2");
    Index on values: key=1, values=1; key=2, values=2; key=2, values=1
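How a multi-valued property ends up in an index can be modelled in a few lines of plain Java. This is an in-memory sketch, not the datastore's storage format: `ListPropertyIndex` and its methods are hypothetical. Each value of the collection produces its own index row, and an equality filter matches an entity when any one of its values equals the filter value.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// In-memory model of a single-property index over a multi-valued property.
final class ListPropertyIndex {
    // value -> keys of the entities that contain that value
    private final Map<String, Set<Long>> rows = new TreeMap<>();

    // One index row is written per (value, entity key) pair.
    void put(long entityKey, Collection<String> values) {
        for (String v : values) {
            rows.computeIfAbsent(v, unused -> new TreeSet<>()).add(entityKey);
        }
    }

    // Equality filter: matches entities where ANY element equals the value.
    List<Long> whereEquals(String value) {
        return new ArrayList<>(rows.getOrDefault(value, new TreeSet<>()));
    }

    // Total number of index rows across all values.
    int rowCount() {
        int n = 0;
        for (Set<Long> keys : rows.values()) {
            n += keys.size();
        }
        return n;
    }
}
```

Storing entity 1 with the single value "1" and entity 2 with the values "2" and "1" yields three index rows, matching the three rows listed on the slide.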
  • Lesson 2: Our new data definition
      • Now we can define a collection field to store the list of recipients
      • public class Message {
      •     @Persistent private String msg;
      •     @Persistent private List<String> recipients;
      •      ...
      • }
      •   Query on the collection field:
      • Query query = pm.newQuery(Message.class);
      • query.setFilter("recipients == recptParam");
      • query.declareParameters("String recptParam");
      • List<Message> msgs = 
      •      (List<Message>) query.execute(currentUserId);
      • But there is one issue with this:
        • Serialization overhead when fetching the messages
        • We don't really care about the contents of this field when displaying the messages
        • So we will take advantage of another trick
  • Lesson 3: Keys-only Queries and AppEngine Key Structure
      • We can perform queries whose return values are restricted to the keys of the entity
        • Currently only supported in low-level datastore API
      • AppEngine keys are structured in a very special way
        •   Stored in protocol buffers 
        •   Consists of an app ID and a series of type-id_or_name pairs
          • pair is entity type name and autogenerated-integer ID or user-provided name
        • Root entities have exactly one of these pairs; child entities have one for each parent and their own
      • Presents a unique ability to retrieve a parent entity's key from the child entity's key
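The key structure described above can be modelled as a path of (kind, id-or-name) pairs. The sketch below is a simplified stand-in, not the SDK's `Key` class: `KeyPath` and its methods are hypothetical, and the protocol-buffer encoding is left out. It shows why the parent key can be derived from the child key without reading anything from the datastore.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of an entity key as a path of (kind, id-or-name) pairs.
// KeyPath is a hypothetical class; the real Key type lives in the App Engine SDK.
final class KeyPath {
    private final String appId;
    private final List<String[]> pairs; // each element: {kind, idOrName}

    private KeyPath(String appId, List<String[]> pairs) {
        this.appId = appId;
        this.pairs = pairs;
    }

    // A root entity has exactly one pair.
    static KeyPath root(String appId, String kind, String idOrName) {
        List<String[]> p = new ArrayList<>();
        p.add(new String[] {kind, idOrName});
        return new KeyPath(appId, p);
    }

    // A child key is the parent's path plus one more pair.
    KeyPath child(String kind, String idOrName) {
        List<String[]> p = new ArrayList<>(pairs);
        p.add(new String[] {kind, idOrName});
        return new KeyPath(appId, p);
    }

    // Dropping the last pair yields the parent key; no datastore read needed.
    KeyPath parent() {
        if (pairs.size() == 1) {
            return null; // root entities have no parent
        }
        return new KeyPath(appId, new ArrayList<>(pairs.subList(0, pairs.size() - 1)));
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder(appId).append(':');
        for (String[] pair : pairs) {
            sb.append('/').append(pair[0]).append('(').append(pair[1]).append(')');
        }
        return sb.toString();
    }
}
```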
  • Lesson 3: A solution to our Serialization Problem
      • Now we can store the irrelevant recipients in a child entity
      • Here's the process:
        • Define a child entity with the recipients field
        • Store the recipients of the message in the child entity
        • Create a keys-only query on the child entity that filters on the recipients field.
        • Get a list of parent keys from the list of child keys
        • Bulk-fetch the parents from the datastore
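The process above can be walked through with an in-memory model before looking at real datastore code. This is a sketch under stated assumptions: two maps stand in for the datastore, `RelationIndexDemo` and its methods are hypothetical, and the linear scan in `parentKeysFor` stands in for what would be an index lookup in the datastore.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// In-memory walk-through of the relation-index steps. Maps stand in for the
// datastore; all names here are hypothetical.
final class RelationIndexDemo {
    // Parent entities: message id -> message body (what gets displayed).
    private final Map<Long, String> messages = new LinkedHashMap<>();
    // Child entities: parent message id -> recipients (only used for filtering).
    private final Map<Long, Set<String>> recipientIndex = new LinkedHashMap<>();

    // Steps 1 and 2: store the message and, separately, its recipients.
    void post(long messageId, String body, Set<String> recipients) {
        messages.put(messageId, body);
        recipientIndex.put(messageId, recipients);
    }

    // Steps 3 and 4: "keys-only" query on the child, mapped to parent keys.
    // The recipients collection itself is never returned to the caller.
    List<Long> parentKeysFor(String userId) {
        List<Long> parentKeys = new ArrayList<>();
        for (Map.Entry<Long, Set<String>> child : recipientIndex.entrySet()) {
            if (child.getValue().contains(userId)) {
                parentKeys.add(child.getKey());
            }
        }
        return parentKeys;
    }

    // Step 5: bulk-fetch the parents by key.
    List<String> inbox(String userId) {
        List<String> result = new ArrayList<>();
        for (Long key : parentKeysFor(userId)) {
            result.add(messages.get(key));
        }
        return result;
    }
}
```

The point of the split is that a potentially long recipients list is never deserialized when messages are fetched for display.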
  • Lesson 3 (contd.): Solution (Data Def.)
    • public class MessageRecipients {
    •     @PrimaryKey private Key id;
    •     @Persistent private List<String> recipients;
    •     @Persistent private Date date;
    •     @Persistent(mappedBy="msgRecpt") private Message msg;
    •     ...
    • }
    • public class Message {
    •     ...
    •      @Persistent private Date date;
    •     @Persistent private String msg;
    •     @Persistent private MessageRecipients msgRecpt;
    •     ...
    • }
  • Lesson 3 (contd): Solution (Querying)
    • DatastoreService dataSvc = ...;
    • Query query = new Query("MessageRecipients")
    •   .addFilter("recipients", FilterOperator.EQUAL, userid)
    •   .addSort("date", SortDirection.DESCENDING)
    •   .setKeysOnly();   // <-- Only fetch keys!
    • // asList() takes a FetchOptions argument; the defaults are used here
    • List<Entity> msgRecpts = dataSvc.prepare(query).asList(FetchOptions.Builder.withDefaults());
    • List<Key> parents = new ArrayList<Key>();
    • for (Entity recep : msgRecpts) {
    •     parents.add(recep.getParent());   // parent key comes from the child key
    • }
    • // Bulk fetch parents using key list
    • Map<Key,Entity> msgs = dataSvc.get(parents);
  • Cool Trick: Lite Full Text Search
      • Most web applications nowadays need some form of full-text search
      • Well we are on Google AppEngine aren't we!
      • Google did actually release a basic searchable model implementation
        • Limited to Python
        • More info:
        • Proper full-text search is in the AppEngine roadmap
      • Some of our earlier lessons do apply here.
  • How do we build it
      • First, it helps to understand how a basic full-text search index works
        • First, break up the text into terms using lexical analysis
        • Then store the terms in a lookup table based on key of the message
          • With List fields, Google AppEngine gives us this one.
        • We build queries using the same tricks.
      • We also apply the same tricks using child entities and key-only queries to optimize for the serialization overhead.
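The first step, breaking text into terms, can be sketched with a regular expression in plain Java. The pattern and the `SearchTerms` class are assumptions for illustration; a real analyzer would also handle stemming, stop words and similar concerns. The resulting set is the kind of value that would be stored in a list property.

```java
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

// Minimal tokenizer sketch; SearchTerms is a hypothetical helper class.
final class SearchTerms {
    // Splits on runs of non-word characters; the exact pattern is an assumption.
    static Set<String> tokenize(String content) {
        Set<String> terms = new TreeSet<>();
        for (String token : content.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!token.isEmpty()) { // split can yield a leading empty string
                terms.add(token);
            }
        }
        return terms;
    }
}
```

Using a set means a word that appears several times in a message is stored only once.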
  • Live example
      • I have deployed a modified version of Google AppEngine guestbook example:
      • If anyone wants to "sign" it right now, please go ahead.
      • We will now search the data
        • Limited to 1-2 word queries
  • How it works.
      • Applies lessons from list fields and keys-only queries
    •   @Persistent Set<String> searchTerms;
      • Our "lexical analysis": Java regular expression
    •   String[] tokens = content.toLowerCase().split("[^\\w]+");   // character class lost in the transcript; \\w is assumed
        • Can use a full-text search library like Lucene to improve this part
      • Another cool feature of list properties: merge-join
        • Think about organizing your data in a Venn-diagram fashion and finding the intersection of your data.
        • Watch your indexes!
      • Can improve this implementation by using Memcache to cache common search queries.
      • Code will be made available after the talk, so you can take a good look for yourself!
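The Venn-diagram intersection behind a multi-word search can be sketched as a merge over sorted key lists. The slides do not describe how the datastore implements its merge-join, so this is a generic sorted-list intersection in plain Java: `MergeJoin`, `add` and `search` are hypothetical names, and one sorted set of entry ids is kept per term.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Generic sorted-list intersection; not the datastore's actual algorithm.
final class MergeJoin {
    // term -> sorted ids of the entries containing that term
    private final Map<String, TreeSet<Long>> postings = new HashMap<>();

    void add(long id, String... terms) {
        for (String t : terms) {
            postings.computeIfAbsent(t, unused -> new TreeSet<>()).add(id);
        }
    }

    // Returns the ids that appear in the list of EVERY query term.
    List<Long> search(String... terms) {
        List<Long> result = new ArrayList<>();
        if (terms.length == 0) {
            return result;
        }
        List<TreeSet<Long>> lists = new ArrayList<>();
        for (String t : terms) {
            TreeSet<Long> p = postings.get(t);
            if (p == null) {
                return result; // a term with no matches empties the intersection
            }
            lists.add(p);
        }
        Long candidate = lists.get(0).first();
        while (candidate != null) {
            boolean allMatch = true;
            for (TreeSet<Long> list : lists) {
                Long next = list.ceiling(candidate);
                if (next == null) {
                    return result; // one list is exhausted
                }
                if (!next.equals(candidate)) {
                    candidate = next; // skip ahead to the next possible match
                    allMatch = false;
                    break;
                }
            }
            if (allMatch) {
                result.add(candidate);
                candidate = lists.get(0).higher(candidate);
            }
        }
        return result;
    }
}
```

Each list is only ever scanned forward, so the work is bounded by the sizes of the per-term lists rather than by the total number of entries.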
  • Conclusions
      • Google AppEngine for Java provides a standardized way to build applications for Google AppEngine
      • In building Socialwok, we have learned several lessons that apply when building a scalable application on Google App Engine
      • Get the Searchable Guestbook code here:
      • In short, Google AppEngine development has never been easier and more interesting!
      • Get started by visiting:
  • Q & A