BigML.io is a RESTful API for creating and managing BigML resources programmatically. These slides explain how to create, retrieve, update and delete BigML Sources, Datasets, Models, and Predictions.
The document discusses GraphQL and Relay concepts including queries, mutations, fragments, and arguments. It also provides examples of GraphQL queries to fetch user and repository data, including nested and filtered data. Relay concepts like prefetch caching, server data updating, and optimistic updates are briefly mentioned as well.
The document provides an overview of GraphQL and GraphQL clients. It discusses:
- The evolution of APIs from RESTful to GraphQL, which provides a more efficient way to query complex data.
- How GraphQL uses a single endpoint and allows clients to specify exactly the data they need through queries.
- Basic GraphQL queries, including selecting fields, nested fields, arguments, variables, fragments, and mutations.
- GraphQL type definitions that serve as documentation.
- GraphQL clients like Relay that optimize data fetching and caching.
- Tools like GraphiQL that allow testing GraphQL queries in an interactive environment.
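The query features listed above can be sketched in a few lines. Below is a minimal example, assuming a GitHub-style schema (the `user` and `repositories` fields, and the variable names, are illustrative, not taken from the decks): a query with arguments, variables, and a fragment, packaged as the single JSON body a client would POST to the one GraphQL endpoint.

```python
import json

# A GitHub-style schema is assumed for illustration; field names like
# `user` and `repositories` are hypothetical.
query = """
query UserRepos($login: String!, $first: Int!) {
  user(login: $login) {
    ...userFields
    repositories(first: $first) {
      nodes { name stargazerCount }
    }
  }
}
fragment userFields on User {
  name
  email
}
"""

# GraphQL sends the query text and its variables together in one JSON
# body to a single endpoint (e.g. POST /graphql).
payload = json.dumps({"query": query,
                      "variables": {"login": "octocat", "first": 5}})
print(payload[:40])
```

The client specifies exactly the fields it needs; the server returns a JSON object mirroring the query's shape.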
The core Search frameworks in Liferay 7 have been significantly retooled to benefit not only from Liferay's new modular architecture, but also from one of the most innovative players in the market: Elasticsearch, which replaces Lucene as the default search engine in Portal. This session will cover topics like clustering and scalability, unveil improvements (both Elasticsearch and Solr) like aggregations, filters, geolocation, "more like this" and other new query types, and also hot new features for the Enterprise like out-of-the-box Marvel cluster monitoring and Shield security.
André "Arbo" Oliveira joined Liferay in early 2014 as a senior engineer and leads the Search Infrastructure team. He's been writing code for a living for 22 years, 14 of them as a Java developer and architect. Ever since discovering Elasticsearch, he's vowed never to write another SQL WHERE clause again.
1) JSON-LD has seen widespread adoption with over 2 million HTML pages including it and it being a required format for Linked Data platforms.
2) A primary goal of JSON-LD was to let JSON developers work with it much as they would with ordinary JSON, while also providing mechanisms to reshape JSON-LD documents into a deterministic structure for processing.
3) JSON-LD 1.1 includes additional features like using objects to index into collections, scoped contexts, and framing capabilities.
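One of those 1.1 features — using an object to index into a collection — can be sketched with stdlib `json` alone. The context terms and data below are illustrative, and the "expansion" step is done by hand rather than by a real JSON-LD processor:

```python
import json

# A JSON-LD 1.1 document using an @index container: the `post` term maps
# an object keyed by index label onto a collection of nodes.
doc = {
    "@context": {
        "@version": 1.1,
        "schema": "http://schema.org/",
        "post": {"@id": "schema:blogPost", "@container": "@index"}
    },
    "@id": "http://example.org/blog",
    "post": {
        "en": {"schema:headline": "Hello"},
        "de": {"schema:headline": "Hallo"}
    }
}

# A conforming processor, when expanding, folds the index keys away and
# yields the plain collection of nodes; we sketch that reshaping by hand.
expanded_posts = [node for key, node in sorted(doc["post"].items())]
print(json.dumps(expanded_posts))
```

The point of the container is authoring convenience: the JSON stays keyed the way the application wants, while processors still recover a deterministic collection.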
Audio available: https://www.liferay.com/web/events-symposium-north-america/recap
Liferay makes it easy to integrate your application with powerful search engines. However, it may be hard to diagnose why your most important content isn't showing up the way you need it to. This session will recap the key concepts for indexing and querying with Liferay Search, and present a number of techniques to guarantee your documents will be found with best possible relevance.
André de Oliveira joined Liferay in early 2014 as a senior engineer and leads the Search Infrastructure team. He's been a Java developer and architect for the last 15 years. Ever since discovering Elasticsearch, he's vowed never to write another SQL WHERE clause again.
JSON-LD 1.1 is being developed to address issues and feature requests since JSON-LD 1.0 was published over 3 years ago. Key changes in JSON-LD 1.1 include allowing objects to index into collections, framing datasets instead of just graphs, scoped contexts, and improved compact IRIs. The timeline suggests community drafts will be completed in Q4 2017 with the goal of a Working Group producing a Recommendation. Related topics that can use JSON-LD like Shape Expressions, Decentralized Identifiers, Linked Data Signatures, and Verifiable Claims were also discussed.
Media owners are turning to MongoDB to drive social interaction with their published content. The way customers consume information has changed and passive communication is no longer enough. They want to comment, share and engage with publishers and their community through a range of media types and via multiple channels whenever and wherever they are. There are serious challenges with taking this semi-structured and unstructured data and making it work in a traditional relational database. This webinar looks at how MongoDB's schemaless design and document orientation give organisations like the Guardian the flexibility to aggregate social content and scale out.
Tabular data represents a large amount of published data on the web. The W3C CSV on the Web working group aims to improve interoperability of CSV and similar tabular formats by developing specifications for metadata, data models, and conversions between CSV, JSON, and RDF. The specifications define methods for parsing CSV into structured models, associating CSV with metadata to provide data typing and relationships, and converting between tabular and other common data formats.
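The metadata-driven part of that model can be sketched with the stdlib `csv` module. The column metadata below is a hypothetical, minimal subset of CSVW-style annotations — just a name and a datatype per column — used to turn raw CSV cells into typed values, which can then be serialized to JSON (one of the conversion targets the group specifies):

```python
import csv
import io
import json

# Hypothetical, minimal slice of CSVW-style column metadata: each column
# gets a name and a datatype that drives how its cells are parsed.
metadata = {"columns": [
    {"name": "city", "datatype": "string"},
    {"name": "population", "datatype": "integer"},
]}

raw = "city,population\nParis,2148000\nLyon,513000\n"
casts = {"string": str, "integer": int}

rows = []
for record in csv.DictReader(io.StringIO(raw)):
    typed = {col["name"]: casts[col["datatype"]](record[col["name"]])
             for col in metadata["columns"]}
    rows.append(typed)

# The typed rows now carry real integers, not strings, and round-trip to JSON.
print(json.dumps(rows))
```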
Google Code Search was a code search engine that indexed open source code from various sources online. It allowed programmers to search code using regular expressions, keywords, and other metadata tags. However, Google discontinued the service in 2013. Popular alternatives to Google Code for hosting and searching code include GitHub, Bitbucket, and CodePlex. These services provide version control, code review, issue tracking, and other collaboration features for developers.
This document discusses using Elasticsearch for social media analytics and provides examples of common tasks. It introduces Elasticsearch basics like installation, indexing documents, and searching. It also covers more advanced topics like mapping types, facets for aggregations, analyzers, nested and parent/child relations between documents. The document concludes with recommendations on data design, suggesting indexing strategies for different use cases like per user, single index, or partitioning by time range.
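A search-plus-aggregation request of the kind described can be sketched as a plain query-DSL body. Note a hedge: the talk predates the facets-to-aggregations transition, and the shape below follows the newer aggregation DSL; the index and field names (`text`, `posted_at`) are assumptions for a social-media corpus, not taken from the deck.

```python
import json

# Hypothetical field names for a social-media corpus. This body would be
# POSTed to e.g. /tweets/_search; here we only build and inspect the JSON.
search_body = {
    # full-text match on the post body
    "query": {"match": {"text": "elasticsearch"}},
    # bucket matching posts per day (aggregations superseded facets)
    "aggs": {
        "mentions_over_time": {
            "date_histogram": {"field": "posted_at",
                               "calendar_interval": "day"}
        }
    },
    "size": 10,
}

print(json.dumps(search_body, indent=2)[:60])
```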
Presented on 10/11/12 at the Boston Elasticsearch meetup held at the Microsoft New England Research & Development Center. This talk gave a very high-level overview of Elasticsearch to newcomers and explained why ES is a good fit for Traackr's use case.
Building Client-side Search Applications with Solr (lucenerevolution)
Presented by Daniel Beach, Search Application Developer, OpenSource Connections
Solr is a powerful search engine, but creating a custom user interface can be daunting. In this fast paced session I will present an overview of how to implement a client-side search application using Solr. Using open-source frameworks like SpyGlass (to be released in September) can be a powerful way to jumpstart your development by giving you out-of-the box results views with support for faceting, autocomplete, and detail views. During this talk I will also demonstrate how we have built and deployed lightweight applications that are able to be performant under large user loads, with minimal server resources.
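The client-side pattern boils down to building requests against Solr's `/select` handler and consuming JSON responses. A minimal sketch of that URL construction follows; the host, port, and collection name are assumptions, while the parameter names (`q`, `wt`, `facet`, `facet.field`, `rows`) are standard Solr query parameters:

```python
from urllib.parse import urlencode

# Hypothetical Solr host and collection; /select is the standard
# search request handler.
base = "http://localhost:8983/solr/products/select"
params = {
    "q": "title:laptop",   # the user's query
    "wt": "json",          # JSON responses are easy to consume client-side
    "facet": "true",
    "facet.field": "brand",  # facet counts for a results sidebar
    "rows": 10,
}
url = base + "?" + urlencode(params)
print(url)
```

A browser-side app would fetch this URL (or proxy it) and render the `response.docs` and `facet_counts` sections of the JSON it gets back.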
The document discusses MongoDB transactions and concurrency. It provides code examples of how to perform transactions in MongoDB using logical sessions, including inserting a document into a collection and updating related documents in another collection atomically. It also discusses some of the features and timeline for implementing distributed transactions in sharded MongoDB clusters.
MongoDB World 2018: Building Intelligent Apps with MongoDB & Google Cloud (MongoDB)
Building intelligent apps involves combining real-time analytics, machine learning, and artificial intelligence to provide personalized recommendations and automate tasks for customers. Developers can use MongoDB and Google Cloud to build intelligent apps in 3 steps: 1) create a base ecommerce app, 2) add a recommendation engine using machine learning, and 3) enable shopping via chat with artificial intelligence. This brings data scientists and developers together to create applications that understand and assist customers.
Norberto Leite gives an introduction to MongoDB. He discusses that MongoDB is a document database that is open source, high performance, and horizontally scalable. He demonstrates how to install MongoDB, insert documents into collections, query documents, and update documents. Leite emphasizes that MongoDB allows for flexible schema design and the ability to evolve schemas over time to match application needs.
Linked Data in Use: Schema.org, JSON-LD and hypermedia APIs - Front in Bahia... (Ícaro Medeiros)
In my talk I walk through Semantic Web initiatives, like RDF and SPARQL, and linked data principles, discuss some implementation and adoption issues, and talk about semantic annotation in HTML. Semantic annotation using the Schema.org vocabulary is demonstrated using both HTML5 Microdata and JSON-LD input. The benefits seen in Google search results with Rich Snippets, Actions in Email, and Google Now are strongly highlighted with real examples.
This document proposes a content model and API to unify access to different types of content like wikis, RDF, binaries, and more. It aims to be used in projects like NEPOMUK, WAVES, and WIF. The model represents content at different levels of granularity from words to documents. Content can be annotated with semantic statements and metadata. All content is addressable and versioned. The API provides functions for basic CRUD operations as well as fulltext search and auto-completion support through a keyword index.
On Tuesday 18th March, the MongoDB team held an online Cloud Workshop in place of the in-person event which was planned.
Attendees learnt how to build modern, event driven applications powered by MongoDB Atlas in Google Cloud Platform (GCP) and were shown relevant operational and security best practices, to get started immediately with their own digital transformations.
MongoDB Europe 2016 - MongoDB 3.4 preview and introduction to MongoDB Atlas (MongoDB)
This document contains notes from a MongoDB conference presentation focused on improvements, extensions, and innovations to MongoDB. Key topics discussed include improvements to Wired Tiger storage engine and replica set election processes, extensions like document validation and $lookup features, and innovations like aggregation pipeline improvements and mixed storage engine sets. Demos were given on Compass UI tool and Atlas cloud database service. The presentation emphasized ongoing work to improve performance and capabilities, extend MongoDB to new domains, and innovate with features like zones and cloud-native services.
Started from the Bottom: Exploiting Data Sources to Uncover ATT&CK Behaviors (JamieWilliams130)
The document discusses enhancing ATT&CK data sources by developing data models. It proposes opportunities like addressing lack of context, redundancy, and broad scope in ATT&CK data sources. It then describes a process to model adversary behavior like COR_PROFILER using relevant data sources and fields. This includes initial detection modeling, adversary simulation, and defining a detection model to validate whether relevant events are detected. Proper data modeling and mapping data sources to their elements can help identify coverage and gaps to enhance ATT&CK.
This document provides an overview of Spring Data and its support for MongoDB. Spring Data provides common repositories and abstraction for data access across NoSQL and SQL databases. It includes the MongoRepository interface which provides basic CRUD functionality for MongoDB. Custom queries can be written for MongoDB through the MongoRepository interface. Spring Data also includes the MongoTemplate class which provides a template-based API for MongoDB similar to its native driver.
Behind the Scenes of KnetMiner: Towards Standardised and Interoperable Knowle... (Rothamsted Research, UK)
Workshop within the Integrative Bioinformatics Conference (IB2018, Harpenden, 2018).
We describe how to use Semantic Web Technologies and graph databases like Neo4j to serve life science data and address the FAIR data principles.
MongoDB Launchpad 2016: What's New in the 3.4 Server (MongoDB)
Asya Kamsky, a lead product manager at MongoDB, discussed improvements, extensions, and innovations in MongoDB. These included improvements to the Wired Tiger storage engine, replica set election process, and initial sync process. MongoDB was also extended with features like document validation, partial indexes, $lookup, read-only views, and faceted search. Innovations involved improvements to the aggregation pipeline, mixed storage engine sets, zones, and BI connectors.
Raiding the MongoDB Toolbox with Jeremy Mikola (MongoDB)
This document provides an overview and examples of various MongoDB tools and techniques, including full-text indexing, geospatial queries, data aggregation, creating a job queue, and using tailable cursors. It demonstrates how to create indexes, perform searches, aggregate data, select and process jobs asynchronously, and consume the oplog in real-time applications. The document is intended to help readers explore the MongoDB toolbox.
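The job-queue technique mentioned above hinges on an atomic find-and-modify: one worker claims a pending job and flips its status in a single operation so no other worker can grab it. To keep this sketch runnable without a server, only the filter, update, and sort documents are built here (the job-document field names are hypothetical); with a live deployment you would hand them to `collection.find_one_and_update(...)`:

```python
import datetime

# Hypothetical job-document fields. Passed to find_one_and_update, the
# filter+update pair claims exactly one pending job atomically.
claim_filter = {"status": "pending"}
claim_update = {
    "$set": {
        "status": "running",
        "claimed_at": datetime.datetime(2020, 1, 1, 12, 0, 0),
    }
}
# Highest priority first, then oldest first among equals.
sort_spec = [("priority", -1), ("created_at", 1)]

print(claim_filter, claim_update["$set"]["status"])
```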
Understanding N1QL Optimizer to Tune Queries (Keshav Murthy)
Every flight has a flight plan. Every query has a query plan. You must have seen its text form, called EXPLAIN PLAN. Query optimizer is responsible for creating this query plan for every query, and it tries to create an optimal plan for every query. In Couchbase, the query optimizer has to choose the most optimal index for the query, decide on the predicates to push down to index scans, create appropriate spans (scan ranges) for each index, understand the sort (ORDER BY) and pagination (OFFSET, LIMIT) requirements, and create the plan accordingly. When you think there is a better plan, you can hint the optimizer with USE INDEX. This talk will teach you how the optimizer selects the indices, index scan methods, and joins. It will teach you the analysis of the optimizer behavior using EXPLAIN plan and how to change the choices optimizer makes.
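The hint-and-verify loop described above can be sketched as plain query text. The bucket and index names below are assumptions (Couchbase's `travel-sample` sample bucket is used for flavor), while `USE INDEX (... USING GSI)` and the `EXPLAIN` prefix are standard N1QL syntax:

```python
# Hypothetical index name on an assumed sample bucket; USE INDEX is the
# standard N1QL optimizer hint.
statement = """
SELECT t.destinationairport, t.sourceairport
FROM `travel-sample` AS t
USE INDEX (idx_route_src USING GSI)
WHERE t.sourceairport = "SFO"
ORDER BY t.destinationairport
LIMIT 10 OFFSET 0
"""

# Prefixing with EXPLAIN returns the query plan instead of rows, which is
# how you check whether the optimizer honored the hint and what spans it built.
explain = "EXPLAIN " + statement.strip()
print(explain.splitlines()[0])
```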
LinkML Intro July 2022.pptx PLEASE VIEW THIS ON ZENODO (Chris Mungall)
NOTE THAT I HAVE MOVED AWAY FROM SLIDESHARE TO ZENODO
The identical presentation is now here:
https://doi.org/10.5281/zenodo.7778641
General introduction to LinkML, The Linked Data Modeling Language.
Adapted from a presentation given to NIH in May 2022
https://linkml.io/linkml
Using Spring Data and MongoDB with Cloud Foundry (Chris Harris)
- The document discusses using Spring and MongoDB with Cloud Foundry. It covers challenges with data access like scaling horizontally and heterogeneous data needs.
- Spring Framework provides data access support for MongoDB through Spring Data. It includes APIs, object mapping, and generic repositories that improve productivity.
- Spring Data for MongoDB includes MongoTemplate for direct access, converters for mapping documents to POJOs, and MongoRepository for common CRUD operations. Examples demonstrate basic usage.
- The document shows how to integrate MongoDB documents with JPA entities for cross-store domain models and provides an example of saving to MongoDB via Spring on Cloud Foundry.
The document discusses best practices for crafting evolvable API responses. It advocates taking back control of representations by thinking of responses as messages rather than objects. This allows APIs to build payloads with just enough data to solve the problem and survive changes over time. The document explores using attribute groups, links, and established formats like HAL and JSON-LD to build representations that are minimal yet provide essential context.
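A response treated as a message rather than an object can be sketched as a minimal HAL-style payload: a small attribute group plus a `_links` section so clients navigate by relation instead of hardcoded URLs. The resource fields and link relations below are illustrative, not taken from the deck:

```python
import json

# A minimal HAL-style message: just enough attributes to solve the
# client's problem, plus links for navigation. Fields are illustrative.
order = {
    "total": 42.50,
    "status": "shipped",
    "_links": {
        "self": {"href": "/orders/523"},
        "customer": {"href": "/customers/87"},
    },
}

body = json.dumps(order)
print(body)
```

Because clients follow the `customer` relation rather than constructing `/customers/{id}` themselves, the server can later move or reshape that resource without breaking them — the evolvability the talk argues for.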
Garbage collection has largely removed the need to think about memory management when you write Java code, but there is still a benefit to understanding and minimizing the memory usage of your applications, particularly with the growing number of deployments of Java on embedded devices. This session gives you insight into the memory used as you write Java code and provides you with guidance on steps you can take to minimize your memory usage and write more-memory-efficient code. It shows you how to
• Understand the memory usage of Java code
• Minimize the creation of new Java objects
• Use the right Java collections in your application
• Identify inefficiencies in your code and remove them
Video available from Parleys.com:
https://www.parleys.com/talk/how-write-memory-efficient-java-code
Intro to new Google cloud technologies: Google Storage, Prediction API, BigQuery (Chris Schalk)
This document introduces several new Google cloud technologies: Google Storage for storing data in Google's cloud, the Prediction API for machine learning and predictive analytics, and BigQuery for interactive analysis of large datasets. It provides overviews and examples of using each service, highlighting their capabilities for scalable data storage, predictive modeling, and fast querying of massive amounts of data.
This document provides an introduction and overview of Google App Engine and developing applications with Python on the platform. It discusses what App Engine is, who uses it, how much it costs, recommended development tools and frameworks, and some of the key services provided like the datastore, blobstore, task queues, and URL fetch. It also notes some limitations of App Engine and alternatives to running your own version of the platform.
Amazon Elasticsearch Service Deep Dive - AWS Online Tech Talks (Amazon Web Services)
Learning Objectives:
- Learn how to configure a secure, petabyte-scale Amazon ES cluster and ingest data into it
- Learn how to build Kibana dashboards to analyze and visualize your data in Amazon ES
- Take away best practices to make your cluster reliable, take backups, and debug slow-running queries and indexing operations
Introduction to Google Cloud platform technologies (Chris Schalk)
This is a presentation given by Google Developer Advocate Chris Schalk at Spring One 2GX on Oct 21st, 2010. It introduces Google Storage for Developers, Prediction API, and BigQuery.
The document discusses Google's big data and machine learning cloud products and services including:
- BigQuery for querying large datasets using SQL.
- Cloud Dataflow for parallel processing of batch and streaming data.
- Cloud ML for machine learning tasks like neural networks.
It provides demonstrations of these services using various datasets and examples of analyzing Wikipedia data and movie recommendations. The document also covers Google's vision, speech, and natural language APIs for tasks like image labeling, text transcription, and sentiment analysis.
LinkML is a modeling language for building semantic models that can be used to represent biomedical and other scientific knowledge. It allows generating various schemas and representations like OWL, JSON Schema, GraphQL from a single semantic model specification. The key advantages of LinkML include simplicity through YAML files, ability to represent models in multiple forms like JSON, RDF, and property graphs, and "stealth semantics" where semantic representations like RDF are generated behind the scenes.
Explaining the Rise of JSON-LD (machine readable JS data). Why its important and how to make sure your website has enabled…
future action buttons.
* Recent changes & examples in the wild
* Live demo of Googles mark-up validator
* GTM config files to take away & enable.
This document provides information about a MongoDB class taught by Alexandre Bergere. The class covers topics including Big Data, NoSQL, MongoDB architecture and modeling, CRUD operations, replication, security, and aggregation. It includes Alexandre's background and credentials, as well as sources and use cases for MongoDB.
The document provides an overview of a 7-week MongoDB course. It includes the course syllabus which covers topics such as CRUD operations, schema design, performance, aggregation framework, application engineering, and case studies. Key concepts taught in Week 1 include what MongoDB is, how it differs from relational databases, and how to install and use MongoDB with Python and the Bottle framework. The document also provides an example of building a "Hello World" app with MongoDB.
This document discusses tools for finding source code on the web and their usage scenarios. It introduces CodeGenie, an Eclipse plugin that searches for code examples given test cases. SAS searches for API usage examples in large code repositories given queries. Koders' Eclipse plugin searches for code examples given a method signature. Koders, Google Code Search, Krugle and Sourcerer can find examples given keywords. Exemplar searches open source projects to find relevant code examples and API usage. The document demonstrates these tools and asks for feedback in a survey.
Prairie DevCon 2015 - Crafting Evolvable API Responsesdarrelmiller71
Web frameworks help you build an API quickly but most have little support for dealing with an API that needs to evolve, forcing you to prematurely version your API. But many industry professionals are telling us not to version. How can we avoid it? Take back control of the content you send over the wire. API responses are the "user interface" of your API and should be crafted with same attention to detail that cause designers to fret over color choices, shadows and highlights. In this talk I’ll show techniques that can be used to build responses that are easier to evolve and highlight the types of practices that encourage breaking changes and force you to version your API.
Webinar: Mastering Python - An Excellent tool for Web Scraping and Data Anal...Edureka!
The free webinar on Python titled "Mastering Python - An Excellent tool for Web Scraping and Data Analysis" was conducted by Edureka on 14th November 2014
Programming Foundation Models with DSPy - Meetup SlidesZilliz
Prompting language models is hard, while programming language models is easy. In this talk, I will discuss the state-of-the-art framework DSPy for programming foundation models with its powerful optimizers and runtime constraint system.
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slackshyamraj55
Discover the seamless integration of RPA (Robotic Process Automation), COMPOSER, and APM with AWS IDP enhanced with Slack notifications. Explore how these technologies converge to streamline workflows, optimize performance, and ensure secure access, all while leveraging the power of AWS IDP and real-time communication via Slack notifications.
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024Neo4j
Neha Bajwa, Vice President of Product Marketing, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
How to Get CNIC Information System with Paksim Ga.pptxdanishmna97
Pakdata Cf is a groundbreaking system designed to streamline and facilitate access to CNIC information. This innovative platform leverages advanced technology to provide users with efficient and secure access to their CNIC details.
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
Climate Impact of Software Testing at Nordic Testing DaysKari Kakkonen
My slides at Nordic Testing Days 6.6.2024
Climate impact / sustainability of software testing discussed on the talk. ICT and testing must carry their part of global responsibility to help with the climat warming. We can minimize the carbon footprint but we can also have a carbon handprint, a positive impact on the climate. Quality characteristics can be added with sustainability, and then measured continuously. Test environments can be used less, and in smaller scale and on demand. Test techniques can be used in optimizing or minimizing number of tests. Test automation can be used to speed up testing.
Removing Uninteresting Bytes in Software FuzzingAftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing xml documents, and Binutil's readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format). Our preliminary results show that AFL+DIAR does not only discover new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/building-and-scaling-ai-applications-with-the-nx-ai-manager-a-presentation-from-network-optix/
Robin van Emden, Senior Director of Data Science at Network Optix, presents the “Building and Scaling AI Applications with the Nx AI Manager,” tutorial at the May 2024 Embedded Vision Summit.
In this presentation, van Emden covers the basics of scaling edge AI solutions using the Nx tool kit. He emphasizes the process of developing AI models and deploying them globally. He also showcases the conversion of AI models and the creation of effective edge AI pipelines, with a focus on pre-processing, model conversion, selecting the appropriate inference engine for the target hardware and post-processing.
van Emden shows how Nx can simplify the developer’s life and facilitate a rapid transition from concept to production-ready applications.He provides valuable insights into developing scalable and efficient edge AI solutions, with a strong focus on practical implementation.
UiPath Test Automation using UiPath Test Suite series, part 5DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 5. In this session, we will cover CI/CD with devops.
Topics covered:
CI/CD with in UiPath
End-to-end overview of CI/CD pipeline with Azure devops
Speaker:
Lyndsey Byblow, Test Suite Sales Engineer @ UiPath, Inc.
UiPath Test Automation using UiPath Test Suite series, part 5
BigML.io - The BigML API
1. BigML.io: The BigML API
October 12, 2012
BigML Inc BigML.io: The BigML API October 12, 2012 1 / 66
2. 1 Introduction
2 BigML Resources
3 Sources
4 Datasets
5 Models
6 Predictions
7 Evaluations
8 Bindings
9 Final Remarks
3. BigML.io: Base URL
Base URL
https://bigml.io
A RESTful API for creating and managing BigML resources
programmatically.
All accesses are performed over HTTPS.
4. BigML.io: Development Mode
Dev Mode
https://bigml.io/dev/
No credits are charged.
Limited to 1MB per resource but unlimited in the number of resources.
5. BigML.io: Version
Version
https://bigml.io/andromeda/
BigML.io's first version is named andromeda.
If you omit the version name in your API requests, you will get access to
the latest API version.
6. BigML.io: Authentication
Authentication
BIGML_USERNAME=alfred
BIGML_API_KEY=62270d2ad14eba4e349432e80d749342de5550a4
BIGML_AUTH="username=$BIGML_USERNAME;api_key=$BIGML_API_KEY"
All accesses to BigML.io must be authenticated.
Authentication is performed by including your username and your BigML
API key in every request.
If you use an environment variable (e.g. BIGML_AUTH), you can keep your
credentials out of your source code.
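As a quick sanity check, the three variables above compose into the query string that every call appends (alfred and the key below are the placeholder credentials from this slide, not real ones):

```shell
# Placeholder credentials from the slide -- substitute your own.
BIGML_USERNAME=alfred
BIGML_API_KEY=62270d2ad14eba4e349432e80d749342de5550a4
BIGML_AUTH="username=$BIGML_USERNAME;api_key=$BIGML_API_KEY"

# Every request simply appends the credentials as query-string parameters:
url="https://bigml.io/source?$BIGML_AUTH"
echo "$url"
```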
7. BigML Resources
Source A source is a file containing the raw data that you want to
use to create a predictive model.
Dataset A dataset is a structured version of a data source where
each column has been assigned a type.
Model A model is created using a dataset as input, selecting which
fields to use as input and which field will be the objective.
Prediction A prediction is created using a model and the new
instance that you want to classify as input.
8. BigML.io: Source
Create a New Source
sepal length,sepal width,petal length,petal width,species
5.1,3.5,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
5.8,2.7,5.1,1.9,Iris-virginica
A source is the raw data that you want to use to create a predictive
model.
A source is usually a (big) file in tabular format.
Each column in the file represents a feature or field.
By default, the last column represents the class or objective field.
The file may have a first row or header with a name for each field.
9. BigML.io: Source
Source Base URL
https://bigml.io/source
Sources can be created from several kinds of data:
Local files
Remote data accessed via HTTP or HTTPS
Files in S3 buckets
Blobs in Windows Azure storage
Inline data contained in the datasource creation request
Data must be in tabular format, cannot be bigger than 64GB, and
can be compressed (.Z or .gz, but not .zip)
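Since .gz is accepted but .zip is not, a minimal sketch of preparing a compressed upload (tiny.csv is a made-up file; the final curl is the same file post shown on the next slide):

```shell
# Create a toy CSV and gzip it; BigML accepts .Z and .gz, but not .zip.
printf 'a,b\n1,2\n3,4\n' > tiny.csv
gzip -f tiny.csv            # replaces tiny.csv with tiny.csv.gz
ls tiny.csv.gz

# The upload itself would then be the usual multipart file post:
#   curl "https://bigml.io/source?$BIGML_AUTH" -F file=@tiny.csv.gz
```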
10. BigML.io: Creating a Source using a local file
Creating a Source
curl https://bigml.io/source?$BIGML_AUTH -F file=@iris.csv
The file must be attached in the post as a file upload
The Content-Type in your HTTP request must be
multipart/form-data, as specified by RFC2388.
11. BigML.io: Creating a Source using a remote URL
Creating a Remote Source
curl https://bigml.io/source?$BIGML_AUTH
-X "POST"
-H "content-type: application/json"
-d '{"remote": "https://static.bigml.com/csv/iris.csv"}'
The Content-Type in your HTTP request must be application/json.
URLs can be HTTP or HTTPS with realm authentication, public or
private Amazon S3, or Windows Azure files.
12. BigML.io: Creating a Source using inline data
Creating an Inline Source
curl https://bigml.io/source?$BIGML_AUTH
-X "POST"
-H "content-type: application/json"
-d '{"data": "a,b,c,d\n1,2,3,4\n5,6,7,8"}'
The Content-Type in your HTTP request must be application/json.
Source data is included in the JSON body as a string with key
“data”.
Maximum size of inline sources is 10MB.
14. BigML.io: Source Arguments
One of (required) Type Description
file multipart form data File.
remote String URL of the remote source.
data String Inline data in tabular format.
Optional Type Description
category Integer The category that best describes the data.
description String A description of the source of up to 8192 characters.
name String The name you want to give to the new source.
private Boolean Whether you want your source to be private or not.
source parser Object Set of parameters to parse the source.
tags List A list of strings that help classify and index your source.
Table: Source Arguments
15. BigML.io: Creating a Source with args
Creating a Source with args
curl https://bigml.io/source?$BIGML_AUTH
-X "POST"
-H "content-type: application/json"
-d '{"remote": "https://static.bigml.com/csv/iris.csv", "name": "iris"}'
17. BigML.io: Updating a Source
Updating a Source
curl https://bigml.io/source/4f64191d03ce89860a000000?$BIGML_AUTH
-X PUT
-H 'content-type: application/json'
-d '{"name": "a new name", "source_parser": {"locale": "es-ES"}}'
18. BigML.io: Deleting a Source
Deleting a Source
curl "https://bigml.io/source/4f603fe203ce89bb2d000000?$BIGML_AUTH"
-X DELETE
Response HTTP/1.1 204 NO CONTENT
19. BigML.io: Retrieving a Source
Retrieving a Source via BigML.io
curl
"https://bigml.io/source/4eee50b90a590f7d5c000008?$BIGML_AUTH"
Visualizing a Source via BigML.com
https://bigml.com/dashboard/source/4eee50b90a590f7d5c000008
20. BigML.io: Source Properties
property type filterable sortable updatable
category Integer yes yes yes
code Integer no no no
content type String yes yes no
created Datetime yes yes no
credits Float yes yes no
description String yes yes yes
fields Object no no no
file name String yes yes no
md5 String no no no
name String yes yes yes
number of datasets Integer yes yes no
number of models Integer yes yes no
number of predictions Integer yes yes no
private Boolean yes yes yes
resource String no no no
rows Integer yes yes no
size Integer yes yes no
source String yes yes no
source status String yes yes no
status Object no no no
tags List yes yes yes
updated Datetime yes yes no
Table: Source Properties
21. BigML.io: Listing Sources
Listing Sources
curl "https://bigml.io/source?limit=10;offset=10;$BIGML_AUTH"
limit Specifies the number of sources to retrieve. Must be less
than or equal to 200.
offset The position in the whole source list at which the
retrieved source list will start.
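To page through more than one batch of sources, the offset advances by the limit on each call; a sketch of the URLs such a loop would request (the $BIGML_AUTH suffix is left literal here as a placeholder):

```shell
# Walk the source list 10 at a time; each URL would be passed to curl
# exactly as in the listing example above.
limit=10
for offset in 0 10 20; do
  echo "https://bigml.io/source?limit=$limit;offset=$offset;\$BIGML_AUTH"
done
```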
23. BigML.io: Filtering Sources
Retrieving sources bigger than 1 MB
curl "https://bigml.io/source?size_gt=1048576;$BIGML_AUTH"
Filter Description
lt Less than
lte Less than or equal to
gt Greater than
gte Greater than or equal to
Table: Filtering Arguments
24. BigML.io: Sorting Sources
Sorting sources by size
curl "https://bigml.io/source?order_by=-size;$BIGML_AUTH"
order by Specifies the order of the sources to retrieve. Must be one
of the sortable fields. If you prefix the field name with “-”,
the order will be descending.
25. BigML.io: Dataset
Dataset Base URL
https://bigml.io/dataset
A dataset is a structured version of a source where each field has
been processed and serialized according to its type.
A field can be numeric or categorical.
Datetime and text fields are coming down the pike.
26. BigML.io: Create a New Dataset
Create a New Dataset
curl "https://bigml.io/andromeda/dataset?$BIGML_AUTH"
-X POST
-H 'content-type: application/json'
-d '{"source": "/source/4ee5761c80e1c664f1000000"}'
28. BigML.io: Dataset Arguments
Required Type Description
source String Valid source/id
Optional Type Description
category Integer The category that best describes the dataset.
description String A description of the dataset of up to 8192 characters.
fields Object The fields that you want to use to create the dataset.
name String Name of the dataset.
private Boolean Whether you want your dataset to be private or not.
size Integer Maximum number of bytes to process.
tags List A list of strings that help classify and index your dataset.
Table: Dataset Arguments
29. BigML.io: Creating a Dataset with args
Creating a Dataset with args
curl "https://bigml.io/dataset?$BIGML_AUTH"
-X POST
-H 'content-type: application/json'
-d '{"source": "/source/4ee5761c80e1c664f1000000", "name": "my dataset"}'
30. BigML.io: Updating a Dataset
Updating a Dataset
curl https://bigml.io/dataset/4f66a0b903ce8940c5000000?$BIGML_AUTH
-X PUT
-H 'content-type: application/json'
-d '{"name": "a new name"}'
31. BigML.io: Deleting a Dataset
Deleting a Dataset
curl "https://bigml.io/dataset/4f66a0b903ce8940c5000000?$BIGML_AUTH"
-X DELETE
Response HTTP/1.1 204 NO CONTENT
32. BigML.io: Retrieving a Dataset
Retrieving a Dataset via BigML.io
curl "https://bigml.io/dataset/4f66a0b903ce8940c5000000?$BIGML_AUTH"
Retrieving a Dataset via BigML.com
https://bigml.com/dashboard/dataset/4f66a0b903ce8940c5000000
33. BigML.io: Dataset Properties
property type filterable sortable updatable
category Integer yes yes yes
code Integer no no no
columns Integer yes yes no
created Datetime yes yes no
credits Float yes yes no
description String yes yes yes
fields Object no no no
locale String no no no
name String yes yes yes
number of models Integer yes yes no
number of predictions Integer yes yes no
private Boolean yes yes yes
resource String no no no
rows Integer yes yes no
size Integer yes yes no
source String yes yes no
source status Boolean yes yes no
status Object no no no
tags List yes yes yes
updated Datetime yes yes no
Table: Dataset Properties
34. BigML.io: Listing Datasets
Listing Datasets
curl "https://bigml.io/dataset?limit=10;offset=10;$BIGML_AUTH"
limit The total number of datasets to retrieve (≤ 200).
offset The offset at which the dataset listing will start.
36. BigML.io: Filtering Datasets
Retrieving datasets bigger than 1 MB
curl "https://bigml.io/dataset?size_gt=1048576;$BIGML_AUTH"
Filter Description
lt Less than
lte Less than or equal to
gt Greater than
gte Greater than or equal to
Table: Filtering Arguments
37. BigML.io: Sorting Datasets
Sorting datasets by size
curl "https://bigml.io/dataset?order_by=-size;$BIGML_AUTH"
order by Specifies the order of the datasets to retrieve. Must be one
of the sortable fields. If you prefix the field name with “-”,
they will be given in descending order.
38. BigML.io: Model
Model Base URL
https://bigml.io/model
A model is a tree-like representation of your dataset with
predictive power.
You can create a model by selecting which fields from your dataset
you want to use as input fields (or predictors) and which field you
want to predict: the objective field.
39. BigML.io: Create a New Model
Create a New Model
curl https://bigml.io/model?$BIGML_AUTH
-X POST
-H 'content-type: application/json'
-d '{"dataset": "dataset/4f66a80803ce8940c5000006"}'
40. New Model
{
  "category": 0,
  "code": 201,
  "columns": 5,
  "created": "2012-05-25T07:13:07.243623",
  "credits": 0.03515625,
  "dataset": "dataset/4f66a80803ce8940c5000006",
  "dataset_status": true,
  "description": "",
  "holdout": 0.0,
  "input_fields": [],
  "locale": "en_US",
  "max_columns": 5,
  "max_rows": 150,
  "name": "iris' dataset model",
  "number_of_predictions": 0,
  "objective_fields": [],
  "private": true,
  "range": [1, 150],
  "resource": "model/4f67c0ee03ce89c74a000006",
  "rows": 150,
  "size": 4608,
  "source": "source/4f665b8103ce8920bb000006",
  "source_status": true,
  "status": {
    "code": 1,
    "message": "The model is being processed and will be created soon"
  },
  "tags": [],
  "updated": "2012-05-25T07:13:07.243658"
}
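The resource field of a response like the one above is what the next call needs (a prediction is created from this model/id). A dependency-free way to pull it out of a saved response, using a response body abridged from this slide:

```shell
# Abridged response; in practice this file would come from something like
#   curl "https://bigml.io/model?$BIGML_AUTH" ... -o response.json
cat > response.json <<'EOF'
{"resource": "model/4f67c0ee03ce89c74a000006", "status": {"code": 1}}
EOF

# Extract the new model's id so it can be fed to the prediction call.
model_id=$(sed -n 's/.*"resource": "\([^"]*\)".*/\1/p' response.json)
echo "$model_id"
```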
41. BigML.io: Model Arguments
Required Type Description
dataset String Valid dataset/id
Optional Type Description
category Integer The category that best describes the model.
description String A description of the model of up to 8192 characters.
input fields List The fields that you want to use to create the model.
name String Name of the model.
objective fields List The field that you want to predict.
private Boolean Whether you want your model to be private or not.
range List The range of successive instances to build the model.
tags List A list of strings that help classify your model.
Table: Model Arguments
42. BigML.io: Creating a Model with args
Creating a Model with args
curl https://bigml.io/andromeda/model?$BIGML_AUTH
-X POST
-H 'content-type: application/json'
-d '{"dataset": "dataset/4f66a80803ce8940c5000006", "input_fields": ["000001", "000003"]}'
43. BigML.io: Updating a Model
Updating a Model
curl https://bigml.io/model/4f67c0ee03ce89c74a000006?$BIGML_AUTH
-X PUT
-H 'content-type: application/json'
-d '{"name": "a new name"}'
44. BigML.io: Deleting a Model
Deleting a Model
curl "https://bigml.io/model/4f67c0ee03ce89c74a000006?$BIGML_AUTH"
-X DELETE
Response HTTP/1.1 204 NO CONTENT
45. BigML.io: Retrieving a Model
Retrieving a Model via BigML.io
curl "https://bigml.io/model/4f66a80803ce8940c5000006?$BIGML_AUTH"
Retrieving a Model via BigML.com
https://bigml.com/dashboard/model/4f66a80803ce8940c5000006
46. BigML.io: Model Properties
property type filterable sortable updatable
category Integer yes yes yes
code Integer no no no
columns Integer yes yes no
created Datetime yes yes no
credits Float yes yes no
dataset String yes yes no
dataset status Boolean yes yes no
description String yes yes yes
input fields Object no no no
locale String no no no
max columns Integer yes yes no
max rows Integer yes yes no
model Object no no no
name String yes yes yes
number of predictions Integer yes yes no
objective fields List no no no
private Boolean yes yes yes
range List no no no
resource String no no no
size Integer yes yes no
statistical pruning Boolean yes yes no
status Object no no no
tags List yes yes yes
updated Datetime yes yes no
Table: Model Properties
47. BigML.io: Listing Models
Listing Models
curl "https://bigml.io/model?limit=10;offset=10;$BIGML_AUTH"
limit The number of models to retrieve (≤ 200).
offset The offset at which the model listing will start off.
49. BigML.io: Filtering Models
Retrieving models bigger than 1 MB
curl "https://bigml.io/model?size_gt=1048576;$BIGML_AUTH"
Filter Description
lt Less than
lte Less than or equal to
gt Greater than
gte Greater than or equal to
Table: Filtering Arguments
50. BigML.io: Sorting Models
Sorting models by size
curl "https://bigml.io/model?order_by=-size;$BIGML_AUTH"
order by Specifies the order of the models to retrieve. Must be one
of the sortable fields. If you prefix the field name with “-”,
they will be given in descending order.
51. BigML.io: Prediction
Prediction Base URL
https://bigml.io/prediction
A prediction is created using a model/id and the properties of the
new instance (input data) for which you wish to create a prediction.
To create a new prediction, BigML.io will automatically navigate the
corresponding model to find the leaf node that best classifies the
new instance.
52. BigML.io: Create a New Prediction
Create a New Prediction
curl https://bigml.io/prediction?$BIGML_AUTH
-X POST
-H 'content-type: application/json'
-d '{"model": "model/4f67c0ee03ce89c74a000006",
"input_data": {"000001": 3}}'
54. BigML.io: Prediction Arguments
Required Type Description
model String Valid model/id.
input data Object Field’s id/value pairs representing the instance.
Optional Type Description
category Integer The category that best describes the prediction.
description String A description of the prediction of up to 8192 characters.
name String Name of the prediction.
private Boolean Whether you want your prediction to be private or not.
tags List A list of strings that help classify and index your prediction.
Table: Prediction Arguments
55. BigML.io: Creating a Prediction with args
Creating a Prediction with args
curl https://bigml.io/andromeda/prediction?$BIGML_AUTH
-X POST
-H 'content-type: application/json'
-d '{"input_data": {"000001": 3},
"model": "model/4f67c0ee03ce89c74a000006",
"name": "my prediction"}'
56. BigML.io: Updating a Prediction
Updating a Prediction
curl https://bigml.io/prediction/4f6a014b03ce89584500000f?$BIGML_AUTH
-X PUT
-H 'content-type: application/json'
-d '{"name": "a new name"}'
57. BigML.io: Deleting a Prediction
Deleting a Prediction
curl "https://bigml.io/prediction/4f6a014b03ce89584500000f?$BIGML_AUTH"
-X DELETE
Response HTTP/1.1 204 NO CONTENT
58. BigML.io: Retrieving a Prediction
Retrieving a Prediction via BigML.io
curl "https://bigml.io/prediction/4f6a014b03ce89584500000f?$BIGML_AUTH"
Retrieving a Prediction via BigML.com
https://bigml.com/dashboard/prediction/4f6a014b03ce89584500000f
59. BigML.io: Prediction Properties
property type filterable sortable updatable
category Integer yes yes yes
code Integer no no no
created Datetime yes yes no
credits Float yes yes no
dataset String yes yes no
dataset status Boolean yes yes no
description String yes yes yes
fields Object no no no
input data Object no no no
locale String no no no
model String yes yes no
model status Boolean yes yes no
name String yes yes yes
objective fields List yes yes no
prediction Object yes yes no
prediction path Object no no no
private Boolean yes yes yes
resource String no no no
source String yes yes no
source status Boolean yes yes no
status Object no no no
tags List yes yes yes
updated Datetime yes yes no
Table: Prediction Properties
60. BigML.io: Listing Predictions
Listing Predictions
curl "https://bigml.io/prediction?limit=10;offset=10;$BIGML_AUTH"
limit The number of predictions to retrieve (≤ 200).
offset The offset at which the prediction listing will start off.
62. BigML.io: Filtering Predictions
Retrieving predictions created after January 12, 2012
curl "https://bigml.io/prediction?created__gt=2012-01-12;$BIGML_AUTH"
Filter Description
lt Less than
lte Less than or equal to
gt Greater than
gte Greater than or equal to
Table: Filtering Arguments
63. BigML.io: Sorting Predictions
Sorting predictions by name
curl "https://bigml.io/prediction?order_by=-name;$BIGML_AUTH"
order by Specifies the order of the predictions to retrieve. Must be
one of the sortable fields. If you prefix the field name with
“-”, they will be given in descending order.
64. BigML.io: Evaluation
Evaluation Base URL
https://bigml.io/evaluation
An evaluation measures how well a model predicts the objective
field of a pre-labeled test set.
An evaluation is created using the model/id of the model under
evaluation and the dataset/id of the test set.
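The evaluation slides give no curl example, but by analogy with the other resources the create call presumably posts both ids (the ids below are reused from earlier slides as placeholders):

```shell
# Hypothetical request body pairing the model under evaluation with the
# test set's dataset/id.
body='{"model": "model/4f67c0ee03ce89c74a000006",
       "dataset": "dataset/4f66a80803ce8940c5000006"}'

# The create call would then mirror the other resources:
#   curl "https://bigml.io/evaluation?$BIGML_AUTH" \
#        -X POST -H 'content-type: application/json' -d "$body"
echo "$body"
```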
65. BigML.io: Public Bindings
Bash https://github.com/bigmlcom/bigml-bash
Python https://github.com/bigmlcom/python
R https://github.com/bigmlcom/bigml-r
iOS https://github.com/fgarcialainez/ML4iOS
Java https://github.com/javinp/bigml-java
Ruby http://vigosan.github.com/big ml/
66. BigML.io: Final Remarks
dev mode Remember to include /dev in your URL requests to avoid
credit charges.
version Remember to include the current version name
/andromeda in your URL requests to make sure that
future versions of the BigML API do not interfere with your
application.
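Both remarks combine into a single base URL that pins the version and stays in dev mode (the path composition below is assumed from the dev-mode and version slides earlier):

```shell
# Dev mode prefix plus explicit version name, per the two remarks above.
base="https://bigml.io/dev/andromeda"
BIGML_AUTH="username=alfred;api_key=placeholder"   # placeholder credentials
echo "$base/source?$BIGML_AUTH"
```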