Presentation of the ONTOCOM model at the SemTech 2008.

Published in: Education

  1. 1. Development effort estimation for large scale business ontologies. Presentation by Dr. Elena Simperl and Dr. Christoph Tempich, 20.05.2008
  2. 2. Content
      1. Motivation
      2. Framework
      3. Cost Drivers
      4. Case Study
      5. Conclusion
      080413_CT_SEMANTIC TECHNOLOGY_V04.PPT Page 1
  3. 3. Development effort estimation for large scale business ontologies: Management Summary
      Ontocom is a framework to help you estimate the effort related to building an ontology. It makes accurate predictions and can be improved with data from your team.
        • Ontologies are the key enablers for knowledge exchange across the Web.
        • Ontocom is a framework to estimate the effort related to ontology building.
        • Ontocom comes with
            • a process for effort estimation,
            • a formula and a tool calculating the estimations, and
            • a methodology to adjust the estimations to a particular company.
        • Ontocom takes the size, the domain, the development complexity, the expected quality and the experience of the staff as input factors.
        • Ontocom estimates ontology building costs to within 30% in 80% of the cases.
        • We successfully applied the methodology in an ontology development project within a large German telecommunication operator as part of a SOA project.
  4. 4. Content: 1. Motivation
  5. 5. Motivation: Key challenges for telecommunication operators
      For telco IT organizations, the migration from today's vertical, silo-type architecture towards a flexible, NGN-compliant, horizontal architecture is a disruptive step.
      [Figure: Traditional telco stove-pipe (vertical) architecture, with separate CRM/Sales, Order Management, Fulfillment, Assurance, Billing and service delivery stacks for Voice, Mobile and Data, next to the future NGN layered (horizontal) IT architecture, with integrated, convergent services over shared CRM/Sales/Order Management, Fulfillment, Assurance, Billing and NGN service delivery layers on top of fixed and mobile network access.]
  6. 6. Motivation: Reference Architecture
      Many operators base this architectural change on a Service Oriented Architecture (SOA) in which an ontology enables communication between different applications.
      [Figure: Reference architecture. Labels include Customer Portal, Partner Portal, Service Factory, Network Factory, Customer Management, Order Management, Problem Management, Billing, Invoicing, SDP, Policy Enforcement, 3rd Party Exposure, Service Bus, Ontology, Security, Federated ID, BPM, Master Data Mgmt., ERP, Service Creation, Service Management, Service Assurance, Resource Management and Supplier Management.]
  7. 7. Motivation: Implementation Timeline
      The implementation of the new architecture is a long-term effort, and an accurate estimation of development efforts is the prerequisite for creating a good project plan.
      [Figure: Elements to align (process, data, IT application, resource, service, ontology model) and the aligned project plan, with activities laid out over columns 06 to 17 across Feb, Mar and Apr.]
  8. 8. Motivation: Ontocom
      Ontocom is a framework to estimate the effort to develop an ontology. Ontocom consists of a process, a formula and a methodology.
      [Figure: Objective and elements of Ontocom: Process, Formula, Methodology.]
  9. 9. Content: 2. Framework
        • Process
        • Formula
        • Example
        • Methodology
  10. 10. Ontocom Framework: Cost estimation in ontology engineering processes
      The project manager of an ontology development project estimates the effort for building, documenting and evaluating the ontology during the requirements analysis.
        • Requirements analysis: motivating scenarios, use cases, existing solutions, effort estimation, competency questions, application requirements
        • Conceptualization: conceptualization of the model, integration and extension of existing solutions
        • Implementation: implementation of the formal model in a representation language
        • Supporting activities: Knowledge acquisition, Documentation, Evaluation
  11. 11. Ontocom Methodology: Process
      Applying Ontocom is easy and follows a five-step process. The project manager defines the different parameters based on the process guidelines which are part of the framework.
        • Step 1: Size estimation
        • Step 2: Evaluation of domain
        • Step 3: Evaluation of development complexity
        • Step 4: Evaluation of expected quality
        • Step 5: Evaluation of personnel
        • Result: Effort estimation
  12. 12. Ontocom Framework: Effort Estimation Formula
      The formula uses information collected in the ontology development process and historical information collected from previous projects to make the effort estimation.
      Parametric Effort Estimation Method:
        $PM = A \cdot \mathit{Size}^{B} \cdot \prod_{i} CD_i$
        • PM: person months
        • A: normalization factor
        • Size: size of the ontology
        • B: learning factor
        • CD_i: cost drivers
      The slide's legend distinguishes the result, input from the project manager, and input from the methodology.
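
The parametric formula on this slide can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name `estimate_person_months` is made up for this example, and the values passed for `a`, `b` and the cost-driver multipliers are hypothetical, since the slides do not publish the calibrated parameters or the unit expected for the size.

```python
from math import prod


def estimate_person_months(size, cost_drivers, a, b):
    """Evaluate PM = A * Size^B * product of all cost-driver multipliers.

    a and b are calibration parameters; the slides do not state their values,
    so the caller has to supply them (any numbers used here are hypothetical).
    """
    if size < 0:
        raise ValueError("size must be non-negative")
    return a * size ** b * prod(cost_drivers)


# Hypothetical inputs chosen only to make the arithmetic easy to follow:
# 2.0 * 4.0**0.5 * (1.5 * 0.5) = 2.0 * 2.0 * 0.75 = 3.0
print(estimate_person_months(4.0, [1.5, 0.5], a=2.0, b=0.5))  # 3.0
```

A cost driver rated as neutral would contribute a multiplier of 1.0 in such a scheme, which leaves the estimate unchanged; that convention is an assumption here and is not stated on the slide.
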
  13. 13. Ontocom Methodology: Example
      The parameters associated with the different cost drivers are predefined in our calculation tool.
      Effort Estimation Formula (example): 6.9 PM = 500 entities × quality of personnel × development complexity
      Each cost driver is rated on a five-level scale: very high, high, average, low, very low.
  14. 14. Ontocom Methodology: Methodology to adapt Ontocom to your company
      For a high accuracy of the model we calculated the parameters aggregating the experience of well over 40 ontology engineering projects. And counting.
      Phases: Model generation, Data collection, Data analysis, Model calibration, Model usage
      Steps: Specify cost drivers, Collect data, Analyze data, Calibrate model, Evaluate model, Release model
      [Figure: Effort estimations, showing the average estimation with a +/-30% tolerance band.]
      The accuracy of the model increases if it is adapted and calibrated with data from your own business.
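
One way to check a calibrated model against the +/-30% tolerance shown on this slide is to count how many past estimates landed within that band. The sketch below assumes the tolerance is measured relative to the actual effort, which the slides do not specify. The function name `share_within_tolerance` and the sample project data are hypothetical.

```python
def share_within_tolerance(pairs, tolerance=0.30):
    """Return the fraction of (estimated, actual) pairs whose estimate lies
    within +/- tolerance of the actual effort.

    Measuring the deviation relative to the actual value is an assumption.
    """
    if not pairs:
        raise ValueError("at least one (estimated, actual) pair is required")
    hits = sum(1 for est, act in pairs if abs(est - act) <= tolerance * act)
    return hits / len(pairs)


# Hypothetical (estimated, actual) person months for five past projects.
history = [(10, 12), (20, 15), (8, 8), (30, 20), (5, 6)]
print(share_within_tolerance(history))  # 0.6
```

Re-running such a check after adding a company's own project data is one simple way to see whether calibration has improved the fit.
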
  15. 15. Content: 3. Cost Drivers
  16. 16. Cost drivers: Model application. Step 1: Size of the ontology
      Explanation
        The size of the ontology. This includes all first class citizens of an ontology. Size is measured in kilo entities.
        • All class definitions
        • All attribute definitions
        • All relationship definitions
        • All rule definitions
      Guidelines
        • Determining the size of a prospective ontology is a challenging task at an early stage of the ontology development process.
        • Existing domain ontologies can help to get a rough estimate.
            1. Search for existing domain ontologies.
            2. Compare coverage of existing domain ontologies with the required level of detail.
            3. Calculate expected size of the new ontology.
      Examples
        An ontology has 500 classes, 700 attributes, 300 relations and no rules. This totals 1.5 k entities.
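
The size example on this slide is a plain sum of the four kinds of first class citizens, expressed in kilo entities. A minimal sketch, with the function name `ontology_size_k` chosen only for this illustration:

```python
def ontology_size_k(classes, attributes, relations, rules):
    """Sum all class, attribute, relationship and rule definitions and
    return the total in kilo entities (thousands of entities)."""
    return (classes + attributes + relations + rules) / 1000


# The slide's example: 500 classes, 700 attributes, 300 relations, no rules.
print(ontology_size_k(500, 700, 300, 0))  # 1.5
```
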
  17. 17. Cost drivers: Model application. Step 2: Evaluation of the domain
      Explanation
        The Domain Analysis Complexity accounts for those features of the application setting which influence the complexity of the engineering outcomes. It consists of three sub categories:
        • The domain complexity
        • The requirements complexity
        • The available information sources
      Guidelines
        DOMAIN
        • Very Low: narrow scope, common-sense knowledge, low connectivity
        • Very High: wide scope, expert knowledge, high connectivity
        REQUIREMENTS
        • Very Low: few, simple requirements
        • Very High: very high number of requirements with a high degree of conflict, high number of usability requirements
        INFORMATION SOURCES
        • Very Low: high number of sources in various forms
        • Very High: none
      Examples
        • An ontology for the cooking domain, having a low number of requirements and a high number of available information sources, has a very low to low domain complexity.
        • An ontology for the chemistry domain, with a high number of requirements and a low number of available information sources, has a high to very high domain complexity.
  18. 18. Cost drivers: Model application. Step 3: Evaluation of the development complexity
      Explanation
        • The Conceptualization Complexity accounts for the impact of a complex conceptual model on the overall costs.
        • The Implementation Complexity takes into consideration the additional effort arising from the usage of a specific implementation language.
      Guidelines
        CONCEPTUALIZATION
        • Very Low: concept list
        • Very High: instances, no patterns, considerable number of constraints
        IMPLEMENTATION
        • Low: the semantics of the conceptualization is compatible with that of the implementation language
        • High: major differences between the two
      Examples
        • An ontology for a search application with a thesaurus has a low development complexity.
        • An ontology for the chemistry domain, modeling reaction patterns, has a high development complexity.
  19. 19. Cost drivers: Model application. Step 4: Evaluation of expected quality
      Explanation
        • The Evaluation Complexity accounts for the additional effort that may be invested in generating test cases and evaluating test results. This includes the effort to document the ontology.
        • Required reusability captures the additional effort associated with the development of a reusable ontology.
      Guidelines
        ONTOLOGY EVALUATION
        • Very Low: small number of tests, easily generated and reviewed
        • Very High: extensive testing, difficult to generate and review
        REUSABILITY
        • Very Low: ontology is used for this application only
        • Very High: ontology should be used across many applications as an upper level ontology
      Examples
        • An ontology which is used for one application only, without extensive testing, has a low factor.
        • An integration ontology which should be used across an entire organization or for many web users, with high documentation requirements, has a high or very high factor.
  20. 20. Cost drivers: Model application. Step 5: Evaluation of personnel
      Explanation
        • Ontologist/Domain Expert Capability accounts for the perceived ability and efficiency of the single actors involved in the process (ontologist and domain expert) as well as their teamwork capabilities.
        • Ontologist/Domain Expert Experience measures the level of experience of the engineering team with respect to performing ontology engineering.
      Guidelines
        ONTOLOGIST/DOMAIN EXPERT CAPABILITY
        • Very Low: 15%
        • Very High: 95%
        ONTOLOGIST/DOMAIN EXPERT EXPERIENCE
        • Very Low: 2 months (ontology) / 6 months (domain)
        • Very High: 3 years (ontology) / 7 years (domain)
      Examples
        • The new project member who has never worked with ontologies and has no experience with the domain has a very low expert experience.
        • The project manager who has been working with ontologies for several years and is experienced in a certain field has a very high expert experience.
  21. 21. Content: 4. Case Study
  22. 22. Case study: Challenge
      We used the Ontocom process and formula to create a project plan for the development of an integration ontology. The challenge was to estimate the size of the ontology.
        • Step 1: Size estimation
        • Step 2: Evaluation of domain
        • Step 3: Evaluation of development complexity
        • Step 4: Evaluation of expected quality
        • Step 5: Evaluation of personnel
        • Result: Effort estimation
  23. 23. Case study: Step 1: Size of the ontology
      For the integration ontology the TMForum has developed the Shared Information Data Model (SID).
      [Figure: SID model overview.]
      Process
        Existing domain ontologies
        • The TMForum is an industry organization of telecommunication operators and their vendors.
        • The SID is a UML model formalizing the knowledge relevant to an operator's business.
        • It has approx. 4.8 k UML elements.
        Coverage of the domain
        • Initial analysis shows that average coverage across all sub domains is at around 40%.
        • The estimated size is 12 k entities.
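
The slide does not say how the 12 k figure was derived. One reading that is consistent with its numbers is to scale the existing model by its coverage: if 4.8 k elements cover about 40% of the domain, full coverage needs about 4.8 k / 0.4 elements. The sketch below shows that arithmetic only; the function name `estimate_total_size` and the linear scaling are assumptions, not the authors' documented procedure.

```python
def estimate_total_size(covered_elements, coverage_percent):
    """Extrapolate the size of a full ontology from an existing model,
    assuming size scales linearly with domain coverage (an assumption)."""
    if not 0 < coverage_percent <= 100:
        raise ValueError("coverage_percent must be in the range (0, 100]")
    return covered_elements * 100 / coverage_percent


# Numbers from the slide: approx. 4.8 k UML elements at around 40% coverage.
print(estimate_total_size(4800, 40))  # 12000.0
```
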
  24. 24. Case study: Step 1: Size of the ontology
      For a detailed estimation of the person months required to build the ontology in the respective sub domains, we analyzed the number of classes in each sub domain.
        • Market / Sales (52): Market Strategy & Plan, Market Segment, Marketing Campaign, Competitor, Contact/Lead/Prospect, Sales Statistic, Sales Channel
        • Customer (74): Customer, Customer Interaction, Customer Order, Customer Statistic, Customer Problem, Customer SLA, Applied Cust. Bill. Rate, Customer Bill, Customer Bill Collection, Customer Bill Inquiry
        • Product (66): Product, Product Specification, Strat. Prod. Portf. Plan, Product Offering, Product Performance, Product Usage Statistic
        • Service (124): Service, Service Specification, Service Applications, Service Configuration, Service Performance, Service Usage, Service Strategy & Plan, Service Trouble, Service Test
        • Resource (241): Resource, Resource Topology, Resource Performance, Resource Strat. & Plan, Resource Specification, Resource Configuration, Resource Usage, Resource Trouble, Resource Test
        • Supplier / Partner (8): S/P Performance, S/P Bill, Supplier/Partner, S/P Plan, S/P Interaction, S/P Product, S/P Order, S/P SLA, S/P Problem, S/P Statistic, S/P Bill Inquiry, S/P Payment
        • Enterprise (25): (Under Construction)
        • Common Business (431): Party, Location, Business Interaction, Policy, Agreement
  25. 25. Case study: Steps 2 - 5
      We rated the remaining factors in accordance with the circumstances found in the project and following the guidelines of the framework.
      Step 2: Evaluation of domain
        • The telecommunication domain has an above average domain complexity.
        • Although there are many information sources available, there are many different requirements and it is a broad domain.
      Step 3: Evaluation of development complexity
        • The development of the ontology has an average complexity.
        • The ontology will be formalized as a UML diagram.
        • It was not planned to model rules or axioms.
      Step 4: Evaluation of expected quality
        • The expected quality was rated very high.
        • Due to the cross organizational use, documentation has to be very detailed.
        • We needed to evaluate the ontology thoroughly.
      Step 5: Evaluation of personnel
        • The personnel was rated slightly below average.
        • Experience with ontology building was limited in the team.
        • Modelers with experience in the telecommunications industry are difficult to find.
  26. 26. Case study: Effort estimation for the development of the data model: June 2007
      The model size will increase slowly and it will take approximately 26 person months to develop the complete ontology if engineers can continuously work on it.
      [Figure: Effort estimations; no. of entities (0 to 12.000) over person months (0 to 35), showing the average estimation with a +/-30% tolerance band.]
      Result
        Estimation:
        • We used an Excel tool to compute the duration of the development.* We varied the number of entities in order to show the increase in entities over time.
        Implications:
        • If modelers stay in the development team the learning rate is quite high.
        • For the development of the entire model with approx. 12.000 UML elements we estimated an effort of 24 months. However, as it is a prediction, variations of around 30% are still within the range of the model.
        • To introduce the first 600 elements, plan six months.
      *: The model was not adapted with data from the operator.
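
The +/-30% tolerance on this slide translates into a concrete range around any point estimate. As a small sketch (the function name `tolerance_band` is hypothetical, and applying the tolerance symmetrically around the estimate is an assumption), an estimate of 26 person months corresponds to a band of roughly 18 to 34 person months:

```python
def tolerance_band(estimate_pm, tolerance=0.30):
    """Return the (lower, upper) bounds of a symmetric tolerance band
    around an effort estimate given in person months."""
    if estimate_pm < 0:
        raise ValueError("estimate_pm must be non-negative")
    return estimate_pm * (1 - tolerance), estimate_pm * (1 + tolerance)


# Headline estimate from the slide: approximately 26 person months.
lower, upper = tolerance_band(26)
print(f"{lower:.1f} to {upper:.1f} person months")
```
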
  27. 27. Case study: Effort estimation for the development of the data model: June 2007
      In order to get a rough estimation for the finalization of the different domains we divided the overall engineering effort into several sub tasks.
      [Figure: Effort estimations; no. of entities (0 to 12.000) over person months (0 to 34), showing the average estimation with a +/-30% tolerance band, annotated with the sub tasks Common business, Customer / Resource, Product / Service, Market / Sales, Supplier / Partner and Enterprise.]
  28. 28. Case study: Comparison with actual numbers in February 2008
      The actual effort is higher than expected. This is mainly due to frequent changes in the modeling team and to technical problems aligning the process and ontology model.
      [Figure: Actual effort; no. of entities (0 to 12.000) over person months (0 to 35).]
      Evaluation
        Changes in the development team:
        • The team consisted of 4 people on average.
        • The team structure changed quite often due to management decisions.
        • This required experienced modelers to train newcomers.
        Aligning the process model with the ontology:
        • Tool support to define the data objects required for activities in a process model is limited.
        • The original model does not account for the integration of an ontology with a process model.
        Size:
        • The estimate of the size of the ontology is relatively good.
        • The project is ongoing.
  29. 29. Content: 5. Conclusion
  30. 30. Lessons learned
      1. Previous experience with ontology building implies a high learning rate. This can only be achieved if the ontology engineering team is constant.
      2. Identify existing taxonomies, models and classifications in order to get a rough estimate of the size of the ontology.
      3. Evaluation and documentation of the model for a better reusability is the most time consuming part.
      4. A clear methodology is essential.
      5. The target picture must be clear to increase efficiency.
      6. Ontology development differs from data modeling and experts are difficult to find.
      Summary
        • Ontocom is a straightforward methodology to estimate the effort related to ontology engineering.
        • The experience incorporated in the methodology includes best practices for ontology engineering.
        • The model can be improved with calibration data from your company.
  31. 31. Thank you. Find out more about Ontocom at: http://ontocom.ag-nbi.de/
  32. 32. Contact
      Dr. Elena Simperl
        Digital Enterprise Research Institute, University of Innsbruck
        ICT Technologiepark, Technikerstr. 21a, 6020 Innsbruck (Austria)
        Phone: +43 512 507 96884
        Fax: +43 512 507 9872
        Mobile: +43 664 812 5236
        e-Mail: elena.simperl@deri.at
      Dr. Christoph Tempich
        Detecon International GmbH, Industry/Competence Practice IT
        Oberkasseler Str. 2, 53227 Bonn (Germany)
        Phone: +49 228 700-1942
        Fax: +49 228 700-2361
        Mobile: +49 (151) 12720065
        e-Mail: Christoph.Tempich@detecon.com