2. Alfred Rotondaro
Page 2
FQL Cache Invalidation
Table of Contents
1 Overview and Purpose
1.1 Audience
2 Problem Description
3 Timestamp Records
3.1 Timestamp Record Job
3.1.1 JAMS Processing
3.1.2 Database Updates
3.1.2.1 FDB Databases
3.1.2.2 Non-FDB Databases
4 Namespace Translations
4.1 Database-to-Namespace Mapping
4.2 Formula-to-Namespace Mapping
5 FQL Engine
5.1 Timestamp Processing
5.1.1 Timestamp Implementations
5.1.1.1 Timestamp included in Key
5.1.1.2 Timestamp included in Data Value
5.2 Client Priming
5.2.1 Client Priming Caveats
5.3 Broadcasting Database Updates
Appendix I – Formula Cache Invalidation Architecture
Appendix II – Timestamp Key and Record Format
1 Overview and Purpose
This document describes the use of timestamp records to determine when data records in Memcached
are invalid, i.e. no longer up to date. It further explains how this mechanism can be used to
create caches that are dynamically populated by clients.
1.1 Audience
This document is written for software engineers who work with applications that use FQL to fetch data.
2 Problem Description
The purpose of caching FQL formulas is to improve application performance by reducing calculation and
I/O time. However, it is also important to ensure the timeliness of cached data.
3 Timestamp Records
Timestamp records provide a mechanism for monitoring the freshness of data in Memcached. The
essential fields of a timestamp record are symbol and time. Appendix II provides a complete listing of
the fields included in a timestamp record.
The timestamp record lists all the identifiers along with the most recent time each ID was updated in any
of the databases in a given namespace. Figure 1 shows a sample timestamp record.
Key               Value
$$TIMESTAMP_FF    FDS   10:00
                  MSFT  10:00
                  IBM   10:00
                  GOOG  10:10
                  MMM   10:05
$$TIMESTAMP_MSCI  MMM   10:15
                  T     10:00
Figure 1: Example of a Timestamp Record
3.1 Timestamp Record Job
A special job called the Timestamp Record Job is used to insert timestamp records into Memcached.
Figure 2 shows the Timestamp Record Job processing database updates. This job can be scheduled to
run through JAMS or can be launched automatically by database updates.
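[Diagram: roll-forward updates from FDB databases, symbol-time updates from non-FDB databases, and
JAMS feed the Timestamp Record Job. The job reads the old timestamp from Memcached and writes the
new timestamp back, e.g. $$TIMESTAMP_FF – FDS_10:00|IBM_10:00|MMM_10:05.]
Figure 2: Database Update Processing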
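The Memcached value shown in Figure 2 reads as a pipe-delimited list of SYMBOL_TIME pairs. The following is a minimal sketch of how a job could merge a batch of symbol-time updates into such a value. It assumes that text encoding and zero-padded HH:MM times within a single day; the function names are hypothetical, and Appendix II describes a binary layout rather than this text form.

```python
def parse_record(value: str) -> dict[str, str]:
    """Parse 'FDS_10:00|IBM_10:00' into {'FDS': '10:00', 'IBM': '10:00'}."""
    record = {}
    for pair in filter(None, value.split("|")):
        # Split on the last underscore so symbols may contain underscores.
        symbol, _, time = pair.rpartition("_")
        record[symbol] = time
    return record


def format_record(record: dict[str, str]) -> str:
    """Serialize the record back to the pipe-delimited form, sorted by symbol."""
    return "|".join(f"{symbol}_{time}" for symbol, time in sorted(record.items()))


def merge_updates(old_value: str, updates: dict[str, str]) -> str:
    """Return a new record value that keeps the most recent time per symbol."""
    record = parse_record(old_value)
    for symbol, time in updates.items():
        # Zero-padded HH:MM strings compare correctly as plain strings.
        if symbol not in record or time > record[symbol]:
            record[symbol] = time
    return format_record(record)


old_value = "FDS_10:00|IBM_10:00|MMM_10:05"
new_value = merge_updates(old_value, {"MMM": "10:15", "GOOG": "10:10", "IBM": "09:45"})
print(new_value)
```

In this sketch the older IBM time is ignored, so a late-arriving update cannot roll a symbol's timestamp backwards.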
3.1.1 JAMS Processing
The Timestamp Record Job is scheduled to run every 15 minutes by JAMS (System: FRMLA_CACHE) to
ensure that the timestamp records exist. When the job starts, it retrieves the timestamp records
from Memcached. If a record does not exist, it is regenerated from a persistent copy or from
scratch. Additionally, the job checks the database file IDs to ensure that these files have not been
changed, i.e. copy-renamed.
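The existence check described above can be sketched as follows. This is only an illustration: plain dictionaries stand in for Memcached and for the persistent copy, the function name is hypothetical, and "from scratch" is modelled as an empty record, which is an assumption about what regeneration produces.

```python
def ensure_timestamp_record(cache: dict, persistent: dict, key: str) -> str:
    """Make sure `key` exists in the cache and report how it was obtained."""
    if key in cache:
        return "present"
    if key in persistent:
        # Regenerate the record from the persistent copy.
        cache[key] = persistent[key]
        return "restored"
    # Regenerate from scratch: assumed here to be an empty record.
    cache[key] = ""
    return "created"


memcached = {"$$TIMESTAMP_FF": "FDS_10:00|IBM_10:00|MMM_10:05"}
persistent_copy = {"$$TIMESTAMP_MSCI": "MMM_10:15|T_10:00"}

for key in ("$$TIMESTAMP_FF", "$$TIMESTAMP_MSCI", "$$TIMESTAMP_OTHER"):
    print(key, ensure_timestamp_record(memcached, persistent_copy, key))
```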
3.1.2 Database Updates
The Timestamp Record Job can be launched through updates from FDB and non-FDB databases. For
FDB databases, a roll-forward trigger mechanism is required, while non-FDB databases require a script to
be called by database engineers for copy-rename updates.
3.1.2.1 FDB Databases
When the Timestamp Record Job is launched by FDB database updates, the job queries the database for
symbol-update information using the existing hash tables. The symbol is normalized as a SEDOL.
3.1.2.2 Non-FDB Databases
For non-FDB databases, the database engineer must provide an input file specifying the symbol-time
information.
4 Namespace Translations
The formula cache invalidation solution is predicated on two types of mappings: database-to-namespace
and formula-to-namespace.
4.1 Database-to-Namespace Mapping
During the processing of a database update, a database-to-namespace mapping is used to determine
the namespace that the database belongs to. Based on that information, the proper timestamp key is
generated.
The actual database-to-namespace mapping is stored in a configuration file. After the mapping is
loaded, the namespace is appended to the timestamp key, which is then used to create the timestamp
record. Figure 3 shows the Timestamp Record Job using the database-to-namespace mapping to update
a timestamp record.
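As a rough illustration of the key generation, the sketch below inverts the namespace-to-database listing from Figure 3 and appends the namespace to the key prefix given in Appendix II. The in-memory dictionary stands in for the configuration file, whose real format is not specified here, and the function name is hypothetical.

```python
# Namespace -> databases, as listed in Figure 3.
NAMESPACE_DATABASES = {
    "FF": ["FF_ANNUAL", "FF_FIELD", "FF_MONTH"],
    "MSCI": ["MSCI_ACE", "MSCI_CHINA_A_CON"],
}

# Inverted view used while processing a database update.
DATABASE_TO_NAMESPACE = {
    database: namespace
    for namespace, databases in NAMESPACE_DATABASES.items()
    for database in databases
}

TIMESTAMP_KEY_PREFIX = "$$FQL_CACHING_TIMESTAMP"  # prefix from Appendix II


def timestamp_key_for_database(database: str) -> str:
    """Append the database's namespace to the timestamp key prefix."""
    namespace = DATABASE_TO_NAMESPACE[database]
    return f"{TIMESTAMP_KEY_PREFIX}_{namespace}"


print(timestamp_key_for_database("FF_MONTH"))
print(timestamp_key_for_database("MSCI_ACE"))
```

Every database in a namespace resolves to the same key, so an update to any of them refreshes one shared timestamp record.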
Database-to-Namespace Mapping
Namespace  Database
FF         FF_ANNUAL
           FF_FIELD
           FF_MONTH
MSCI       MSCI_ACE
           MSCI_CHINA_A_CON
Timestamp Update (Timestamp Record Job)
Key               Value
$$TIMESTAMP_FF    FDS   10:00
                  MSFT  10:00
                  IBM   10:00
                  GOOG  10:10
                  MMM   10:05
$$TIMESTAMP_MSCI  MMM   10:15
                  T     10:00
Figure 3: Timestamp Record Processing
4.2 Formula-to-Namespace Mapping
While processing client requests, the formula-to-namespace mapping and the timestamp are used. The
components and steps involved in processing a client request are shown in Figure 4.
The formula-to-namespace mapping is stored in the file fql_cacheable_formula.txt. Based on the
mapping, the appropriate namespace is appended to the timestamp key, which is then used to retrieve
a timestamp record containing symbols and their corresponding most recent time of update. This time
is then appended to the cache key that is used to fetch data from Memcached. In case of a cache miss,
the same cache key is used to insert data into Memcached.
[Diagram: the Client sends a Formula to the FQL Engine. The FQL Engine reads the Timestamp and the
cached Data from Memcached (e.g. $$TIMESTAMP_FF – FDS_10:00|IBM_10:00|MMM_10:05) and reads from the
DB on a cache miss. The numbered steps are:]
1) Formula to Namespace Mapping.
   Symbol  Formula   Namespace
   IBM     FF_Sales  FF
2) Timestamp retrieved from Memcached.
   Key                         Value
   $$FQL_CACHING_TIMESTAMP_FF  IBM 10:00
3) Timestamp appended to cache key to fetch data from Memcached.
   Cache key: FF_Sales( )_IBM_10:00
4) In case of a cache miss, data is inserted into Memcached, using the cache key.
Figure 4: Client Request Processing
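The four steps in Figure 4 can be sketched as a single lookup routine. This is a simplified illustration, not the engine's code: a dictionary stands in for Memcached, the timestamp record is held as a symbol-to-time dictionary, and the `fetch` and `compute` names are hypothetical. The cache key format follows the example in Figure 4.

```python
FORMULA_TO_NAMESPACE = {"FF_Sales": "FF"}  # stand-in for fql_cacheable_formula.txt


def fetch(cache: dict, formula: str, symbol: str, compute):
    """Return (value, 'hit' or 'miss') for one formula/symbol request."""
    # 1) Formula-to-namespace mapping.
    namespace = FORMULA_TO_NAMESPACE[formula]
    # 2) Timestamp record retrieved from the cache.
    record = cache[f"$$FQL_CACHING_TIMESTAMP_{namespace}"]
    # 3) Timestamp appended to the cache key used for the data fetch.
    cache_key = f"{formula}( )_{symbol}_{record[symbol]}"
    if cache_key in cache:
        return cache[cache_key], "hit"
    # 4) Cache miss: compute from the database and insert under the same key.
    value = compute(formula, symbol)
    cache[cache_key] = value
    return value, "miss"


calls = []


def compute(formula, symbol):
    calls.append((formula, symbol))
    return 42.0  # placeholder result, not real data


cache = {"$$FQL_CACHING_TIMESTAMP_FF": {"IBM": "10:00"}}
first = fetch(cache, "FF_Sales", "IBM", compute)
second = fetch(cache, "FF_Sales", "IBM", compute)

# A database update moves the symbol's timestamp, which changes the cache key.
cache["$$FQL_CACHING_TIMESTAMP_FF"]["IBM"] = "10:10"
third = fetch(cache, "FF_Sales", "IBM", compute)
print(first, second, third)
```

No entry is ever deleted in this sketch. Invalidation happens because a newer timestamp yields a key that has not been populated yet.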
5 FQL Engine
The FQL Engine retrieves and interprets caching timestamps and then primes Memcached with updated
data. Appendix I shows the FQL Engine in relation to the overall design of the formula cache
invalidation architecture.
5.1 Timestamp Processing
At the start of a download/report session, the FQL Engine uses a single fetch to retrieve from
Memcached the timestamp records for a namespace. These records, which are stored in process cache
for one minute, are copied to FQL interpreter objects, where the timestamp records remain in effect
until the end of the download. The time in these records is then used to ensure that the client is getting
the most recent data.
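The two lifetimes described above, one minute in the process cache and the length of a download in the interpreter, can be sketched as follows. The class names are hypothetical, and the injectable clock is only there to make the behaviour easy to demonstrate.

```python
import time


class TimestampProcessCache:
    """Holds timestamp records per namespace for a limited time."""

    def __init__(self, fetch_record, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch_record = fetch_record  # performs the single fetch from the cache server
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # namespace -> (fetched_at, record)

    def get(self, namespace):
        now = self._clock()
        entry = self._entries.get(namespace)
        if entry is None or now - entry[0] >= self._ttl:
            entry = (now, self._fetch_record(namespace))
            self._entries[namespace] = entry
        return entry[1]


class InterpreterSession:
    """Copies the record at session start and keeps it until the session ends."""

    def __init__(self, process_cache, namespace):
        # Copy, so a later refresh does not change a running download.
        self.timestamps = dict(process_cache.get(namespace))


now = [0.0]
fetches = []


def fetch_record(namespace):
    fetches.append(namespace)
    return {"IBM": "10:00"}


process_cache = TimestampProcessCache(fetch_record, clock=lambda: now[0])
session_a = InterpreterSession(process_cache, "FF")  # fetches
now[0] = 30.0
session_b = InterpreterSession(process_cache, "FF")  # served from process cache
now[0] = 61.0
session_c = InterpreterSession(process_cache, "FF")  # entry expired, fetches again
print(len(fetches))
```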
5.1.1 Timestamp Implementations
There are two options for implementing the tracking of timestamps: the first option is to make the
timestamp part of the key, while the second option is to embed the timestamp into the data value.
5.1.1.1 Timestamp included in Key
This is the option that is being implemented. The advantage of this option is that it results in more true
hits, while the disadvantage is that it creates more keys and thus requires more storage.
5.1.1.2 Timestamp included in Data Value
This option is not currently being implemented. Its advantage is that it requires less storage, as
it just overwrites existing keys. Its disadvantages are that it results in false hits and also
requires more post-processing to extract the key.
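The trade-off between the two options can be illustrated with a small sketch. The helper names are hypothetical, and the reading of a "false hit" used here (an entry is found but carries a different timestamp than the one the reader expects) is an interpretation of the text above.

```python
def write_ts_in_key(cache, formula, symbol, ts, data):
    cache[f"{formula}( )_{symbol}_{ts}"] = data  # one key per timestamp


def read_ts_in_key(cache, formula, symbol, ts):
    return cache.get(f"{formula}( )_{symbol}_{ts}")  # None means a miss


def write_ts_in_value(cache, formula, symbol, ts, data):
    cache[f"{formula}( )_{symbol}"] = (ts, data)  # overwrites the existing key


def read_ts_in_value(cache, formula, symbol, ts):
    entry = cache.get(f"{formula}( )_{symbol}")
    if entry is None:
        return None
    stored_ts, data = entry
    # Post-processing: the entry was found, but it may be for another timestamp.
    return data if stored_ts == ts else None


key_cache, value_cache = {}, {}
for ts, data in [("10:00", 1.0), ("10:10", 2.0)]:
    write_ts_in_key(key_cache, "FF_Sales", "IBM", ts, data)
    write_ts_in_value(value_cache, "FF_Sales", "IBM", ts, data)

print(len(key_cache), len(value_cache))
# A session still pinned to the 10:00 timestamp:
print(read_ts_in_key(key_cache, "FF_Sales", "IBM", "10:00"))
print(read_ts_in_value(value_cache, "FF_Sales", "IBM", "10:00"))
```

The key-based cache holds two entries and still serves the session pinned to 10:00. The value-based cache holds one entry and has to reject it for that session.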
5.2 Client Priming
Clients are allowed to insert into Memcached on cache misses. However, only the latest data is to be
inserted into Memcached. Therefore, it is necessary to determine whether the data returned from the
databases reflects the latest available data, as a client might have a handle to a stale database.
5.2.1 Client Priming Caveats
The following caveats apply to client priming:
• Stale handles to a database file will result in the insertion of stale data.
• Client access is handled on a case-by-case basis.
• Relative dates are used with date-manipulating formulas by appending the Calendar and Zero
  date to the cache key.
5.3 Broadcasting Database Updates
At the database level, fdb_database uses DLM to signal to readers of the database whether an update is
available. This information is propagated to the FQL Engine to determine whether to insert the formula
result into Memcached. When data is inserted into the cache, an “is_updated” flag is propagated up
from the database. If the “is_updated” flag is false, then data is inserted into the cache using the
timestamp previously retrieved from Memcached.
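A minimal sketch of the insert decision follows. The text only states what happens when the flag is false; skipping the insert when the flag is true is an assumption made here, and the function name is hypothetical.

```python
def maybe_prime(cache: dict, cache_key: str, result, is_updated: bool) -> bool:
    """Insert the formula result unless the database signalled a pending update."""
    if is_updated:
        # Assumption: the reader's handle may be stale, so nothing is inserted.
        return False
    # The cache key already carries the timestamp previously retrieved from the cache.
    cache[cache_key] = result
    return True


cache = {}
inserted = maybe_prime(cache, "FF_Sales( )_IBM_10:00", 42.0, is_updated=False)
skipped = maybe_prime(cache, "FF_Sales( )_MMM_10:05", 7.0, is_updated=True)
print(inserted, skipped, sorted(cache))
```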
Appendix I – FQL Cache Invalidation Architecture
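[Diagram: roll-forward updates from FDB databases, symbol-time updates from non-FDB databases, and
JAMS feed the Timestamp Record Job, which replaces the old timestamp in Memcached with the new
timestamp, e.g. $$TIMESTAMP_FF – FDS_10:00|IBM_10:00|MMM_10:05. The Client sends a Formula to the
FQL Engine, which reads the Timestamp and cached Data from Memcached and reads from the DB on a
cache miss.]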
Appendix II – Timestamp Key and Record Format
The timestamp key consists of the prefix $$FQL_CACHING_TIMESTAMP along with a specific namespace
appended to it: for example, $$FQL_CACHING_TIMESTAMP_FF. The timestamp record is a structure
with the following data elements:
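{
    int    version;
    char   fdsTableNumber[8];
    u_int  maxSymbolLength;      /* length of each symbol field */
    u_int  numberOfSymbols;      /* number of entries in aSymbolMap */
    time_t earliestTime;
    time_t latestTime;
    struct symbolMap
    {
        char   symbol[maxSymbolLength];
        time_t time;             /* most recent update time for the symbol */
    };
    symbolMap aSymbolMap[numberOfSymbols];
};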
Note: The format of the timestamp record is subject to change pending performance results.
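To make the layout concrete, the sketch below serializes and parses a record with the fields listed above. Several details are assumptions rather than part of the specification: little-endian packing without padding, a 4-byte int and u_int, an 8-byte time_t, ASCII symbols padded with NUL bytes, and earliestTime/latestTime taken as the minimum and maximum of the per-symbol times. The function names are hypothetical.

```python
import struct

# version, fdsTableNumber[8], maxSymbolLength, numberOfSymbols, earliestTime, latestTime
HEADER = "<i8sIIqq"


def pack_record(version: int, table_number: str, symbols: dict[str, int]) -> bytes:
    """Serialize symbol -> epoch-seconds pairs into the assumed binary layout."""
    max_len = max((len(s) for s in symbols), default=0)
    times = list(symbols.values())
    blob = struct.pack(
        HEADER, version, table_number.encode("ascii"), max_len, len(symbols),
        min(times, default=0), max(times, default=0),
    )
    for symbol, t in symbols.items():
        # Each entry is a fixed-width symbol followed by its time.
        blob += struct.pack(f"<{max_len}sq", symbol.encode("ascii"), t)
    return blob


def unpack_record(blob: bytes) -> dict:
    """Parse a blob produced by pack_record back into Python values."""
    version, table, max_len, count, earliest, latest = struct.unpack_from(HEADER, blob, 0)
    entry_format = f"<{max_len}sq"
    offset = struct.calcsize(HEADER)
    symbols = {}
    for _ in range(count):
        raw_symbol, t = struct.unpack_from(entry_format, blob, offset)
        symbols[raw_symbol.rstrip(b"\0").decode("ascii")] = t
        offset += struct.calcsize(entry_format)
    return {
        "version": version,
        "fdsTableNumber": table.rstrip(b"\0").decode("ascii"),
        "earliestTime": earliest,
        "latestTime": latest,
        "symbols": symbols,
    }


blob = pack_record(1, "FF", {"IBM": 1000, "MSFT": 1600})
record = unpack_record(blob)
print(len(blob), record)
```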