Global Graph Summit Sessions

These are the sessions already confirmed for the Global Graph Summit. For a list of all the sessions included with your Data Day Texas ticket, visit the Data Day Texas session list.

Graph Keynote - From Theory to Production

Dr. Denise Gosnell - DataStax

We are here to build applications with graph data and deliver value. The graph community has spent years defining and describing our passion. In order to translate graph thinking into a production application, there is a suite of hard decisions that have to be made. It's time for graph to go mainstream!
This talk will walk through some practical and tangible decisions that come into play when shipping distributed graph applications. Developers need a tangible set of playbooks to work from, and my years of experience have narrowed the decisions down to those that are most universal and most difficult to spot. Let's see how well they match up with yours.

The Intelligent Sales Organization Runs on Speech Recognition, Knowledge Graphs and AI.

Dr. Jans Aasman - Franz Inc. / Shannon Copeland - N3

We describe a real world Intelligent Sales Organization that uses graph based technology for taxonomy driven entity extraction, speech recognition, machine learning and predictive analytics to improve quality of conversations, increase sales and improve business visibility.
The details: In the typical sales organization the contents of the actual chat or voice conversation between agent and customer are a black hole. In the modern Intelligent Sales Organization (“ISO”) the interactions between agent and customer are a source of rich information that helps agents to improve the quality of the interaction in real time, creates more sales, and provides far better analytics for management. An ISO is enabled by at least five main technologies: a taxonomy of the products and services sold, speech recognition to turn conversations into text, a taxonomy-driven entity extractor to take the important concepts out of conversations, and machine learning to classify chats in various ways. All of this is stored in a real-time Knowledge Graph that also knows (and stores) everything about customers and agents and provides the raw data for machine learning to improve how the ISO does business.

Intro to Graph Databases for Data Scientists

Dave Bechberger - DataStax

With the rise of graph databases, graphs are no longer just a data structure but a powerful set of capabilities at the persistence layer which data scientists can leverage to accelerate the speed to insight. Unlike the relational world, we can now create graph models that work on 10k records or tens of millions of records and do so in real time.
In this talk, we will walk someone familiar with building for relational databases through the process of how to start leveraging the power of graph databases. We will talk about what makes good and bad use cases, how to build effective models, and how to prevent your developers and data engineers from pulling their hair out when it goes to production.

90 minute workshop - Hands-On Introduction to Gremlin Traversals

Dr. Artem Chebotko - DataStax

This workshop introduces the Gremlin graph traversal language from Apache TinkerPop by exploring graph access patterns that are commonly seen in real-life applications. It features many practice problems, where the access pattern complexity is gradually increased from elements to paths, from paths to subgraphs, and from subgraphs to arbitrary graph patterns.
This workshop is hands-on. Each attendee gets a free cloud instance with pre-installed database and notebook software that can be accessed via a web browser to run Gremlin traversal examples and complete practice problems. A laptop is required to participate but is not absolutely necessary to learn and benefit.
This workshop has no vendor-specific content to learn or understand but does use a graph database, DataStax Enterprise Graph, and notebook software, DataStax Studio, to run examples and test solutions to practice problems. All practice problems use Apache TinkerPop’s Gremlin with no vendor-specific extensions.
Intended audience: Beginners and Intermediate
Technical skills and concepts required: No prior graph data management experience is required.

A Scalable Graph Database Platform with ArangoDB on Kubernetes

Michael Hackstein - ArangoDB

Many applications today rely on highly connected data consisting of edges & vertices. Context and semantics become more and more important for fraud detection, recommendation systems, identity & access management or neural networks in artificial intelligence. Graph datasets can quickly outgrow the capabilities of a single machine.
Ideally, we would like to move to a cluster of small, cheap machines. But how do we overcome the network hop problem when the queried data resides on different machines?
Kubernetes has become the leading orchestration system to run containers in the cloud. In the past, running stateful applications was considered “difficult”. Recent developments in Kubernetes like Persistent Volume Claims, Custom Resource Definitions and Service Operators have enabled the creation of advanced solutions for stateful services.
Michael will show developers, DevOps, Data Scientists and all interested folk how to deploy & run a distributed graph database with only 7 lines of YAML code on Kubernetes. Furthermore, he will show live on stage how to scale a graph database to billions of nodes & edges while preserving fast query execution.

Statistically representative graph generation and benchmarking

Chris Lu

In order to evaluate the performance of a graph database or a graph query solution, it is often necessary to generate a large graph dataset, over which we can then execute queries. However, while there are a number of benchmarks for graph databases which provide tools for data generation, they typically offer few if any options for tailoring the generated graph to the unique schema, topology, and statistics of the target domain. In practice, this limits the value of these benchmarks for capacity planning and estimation of query latency. In this talk, we will describe an open-source framework for property graph generation and benchmarking in close correspondence with a schema and a statistical model. A simple declarative language is provided for schema and model, while the reference implementation is written in Java and builds upon the Apache TinkerPop graph database API.
Intended audience: Graph developers
Technical skills and concepts required: Familiarity with the property graph data model. Some experience with graph database backends recommended though not required.

Operationalizing Graph Analytics With Neo4j

William Lyon - Neo4j

Data science is great and all, but when it comes time to implement some of the advanced features data scientists have prototyped, developers can be left struggling. This talk will show how data scientists and developers can live in harmony by using Neo4j graph database for both operational and analytic workloads.
Taking a tour through the process of making an application smarter with personalized recommendations, graph-based search, and knowledge graph features, we will move beyond just operational workloads to add these features to our application with graph algorithms, using Neo4j for HTAP (hybrid transactional/analytical processing).
We demonstrate how to run graph algorithms such as personalized PageRank, community detection, and similarity metrics in Neo4j and how these algorithms can be used to improve our application. We'll show how to keep our application up to date as data comes into the system, discuss architectural considerations, and enhance data scientists’ capabilities by building user-friendly applications.
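As a rough illustration of the personalized PageRank idea mentioned above, here is a minimal plain-Python sketch. It is not Neo4j's implementation or API; the function name, the toy viewing graph, and the parameter defaults are all assumptions made for this example. Rank is computed by power iteration, with every random restart returning to a single source user.

```python
from collections import defaultdict


def personalized_pagerank(edges, source, damping=0.85, iterations=50):
    """Power-iteration personalized PageRank over a directed edge list.

    Hypothetical helper for illustration only; all restart probability
    is placed on `source`.
    """
    out_links = defaultdict(list)
    nodes = set()
    for src, dst in edges:
        out_links[src].append(dst)
        nodes.update((src, dst))

    # Start with all probability mass on the source node.
    rank = {n: 1.0 if n == source else 0.0 for n in nodes}
    for _ in range(iterations):
        nxt = {n: 0.0 for n in nodes}
        for n in nodes:
            targets = out_links.get(n)
            if targets:
                # Spread the damped rank evenly over outgoing edges.
                for t in targets:
                    nxt[t] += damping * rank[n] / len(targets)
            else:
                # Dangling node: return its damped rank to the source.
                nxt[source] += damping * rank[n]
        # Restart step: the remaining (1 - damping) mass goes to the source.
        nxt[source] += 1.0 - damping
        rank = nxt
    return rank


# Toy user/movie graph (made-up data); each "watched" relation is
# stored in both directions so rank can flow back and forth.
watched = [("alice", "matrix"), ("alice", "inception"),
           ("bob", "matrix"), ("bob", "alien"), ("carol", "alien")]
edges = watched + [(movie, user) for user, movie in watched]

scores = personalized_pagerank(edges, source="alice")
seen = {movie for user, movie in watched if user == "alice"}
movies = {movie for _, movie in watched}
recommendations = sorted(movies - seen, key=scores.get, reverse=True)
```

In this toy graph the only unseen movie for alice is the one reached through a user with overlapping taste, which is the intuition behind using personalized PageRank for recommendations.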

Practical Graph Algorithms

Rob McDaniel - (Stealth)

"Graphs are everywhere." We've all heard it. But what can you do with them? By now everyone is familiar with the common analytics use-cases and traversal scenarios, but this talk will shift gears to discuss actual, practical graph scenarios. Topics will include simple techniques for comparing graphs by similarity, common data mining approaches for identifying and locating subgraphs, as well as a basic introduction to graph partitioning. Can't remember what an eigenvector is, or forgotten why they are so cool? Curious what they have to do with graphs? This is the talk for you. Linear algebra not required.

Moving Beyond Node Views

Lynn Pausic - Expero

When people talk about visualizing graph data, what typically comes to mind is the canonical node view. Node views display nodes (vertices) and the relationships (edges) between them. With large data sets consisting of millions of vertices and edges, node views can quickly become unwieldy to use and comprehend. Further, traditional UI patterns and visualizations conceived for relational schemas often don’t work with graph data. Relational schemas are predefined and relatively static, making it easy to tailor UI navigation to the available data dimensions. Due to the distinct mathematical nature of graph data, traversing data in a graph is fairly different. While this presents additional challenges, there are also opportunities. Traversing a graph with certain algorithms allows you to, for example, show key influencers in social networks, clusters of communities in customer reviews or weak points in electrical grids. These new insights into data provide novel tools to craft innovative user experiences. But this opportunity comes at a price, namely more complexity. Through building and deploying dozens of applications driven by graph data, we’ve developed a unique approach to building UIs driven by graph data and an arsenal of data visualizations that work well across a broad range of contexts. In this talk we’ll share various tools and examples for displaying graph data in meaningful ways to users.

And Bad Mistakes, I’ve Made a Few: Experience from the Trenches as a Graph Data Architect

Josh Perryman - Expero

For four years Josh has crossed the country and traveled the globe working on graph data projects of all sizes, in a variety of industries, with several different engines. And he’s not been alone. Expero has several graph data architects who have worked with a host of client projects. The best of their combined experiences are collected in this one talk.
In this session Josh will cover several specific lessons learned in the pursuit of client success on the frontiers of connected data technology. These include fables such as: “graph is always awesomer than relational databases (except when it isn’t)”, “my practical access patterns beat up your elegant schema”, “the trick to fast ingest is to store no data (and the client loved us)”, “I’ve got write amplification and I know just how to use it”, “you really can have too many vertex labels in the model”, and the classic tale: “sometimes the best edge is the one that only works one way”.

GQL: Towards a Standardized Property Graph Query Language

Dr. Petra Selmer - Neo4j

Over the past decade, property graph databases have experienced phenomenal growth within industry across multiple domains such as master data and knowledge management, fraud detection, network management, access control and healthcare, among others. With this proliferation in usage, the need for a standardized property graph query language has become ever more pressing. Efforts are underway within the ISO framework to define and describe a standard property graph query language, GQL. In this talk, I will introduce GQL, and detail the landscape, scope and features envisaged for the first version of GQL, such as complex pattern matching and composable graph querying. I will provide a roadmap of the standardization process, and also describe the outcome of an analytical comparison of existing property graph query languages, which will be one of the inputs into the design of GQL. To conclude, I will outline future directions.
Technical skills and concepts required: Some knowledge/awareness of property graphs would be useful.

A Graph is a Graph is a Graph: Equivalence, Transformations, and Composition of Graph Data Models

Joshua Shinavier - Uber

The power of graphs lies in their intuitiveness: there is nothing much simpler to visualize or reason about than a bunch of dots connected by a bunch of lines. In practice, however, there are a variety of graph data models, separated by shades of expressivity and nuance. These include property graphs and their variants, RDF-based ontology languages, hypergraph data models, entity-relationship models, and any number of formats and schema languages which are somehow graph-like, though not specifically designed for graphs. Over the years, countless special-purpose tools have been written to transform one graph data model to another, or to provide graph views over this or that type of data source. In this talk, we will bring some order to this chaos using concepts from functional programming and category theory, with an emphasis on bidirectional and composable transformations. Along the way, we will ponder the grand vision of bringing together the whole of a company’s data as a knowledge graph.
Technical skills and concepts required: Basic familiarity with the property graph data model. Some experience with functional programming may help. However, concepts will be introduced at a high level, and should be reasonably easy to follow.

Predicting new edges in large scale dynamic graphs

Gabriel Tanase - Graphen

Lots of enterprises today are using graph databases and graph algorithms to model and solve complex problems. At Graphen we are successfully using graphs in the Fintech, Cybersecurity and Health domains. With knowledge extracted from graphs we improve the accuracy of various machine learning pipelines when predicting non-performing loans, money laundering or users that behave outside of their predefined ‘normal’ activities (intrusion detection).
In this talk we focus on predicting whether a new connection will appear between two entities in a graph (link prediction). This is the high-level concept we use when predicting non-performing loans in a bank, but the concept itself can be applied to various other domains. For example, one can be interested in predicting which two authors may co-author a paper in the coming year, or who will be friends with whom in a social network, or even who may acquire a certain item (product recommendation). It turns out that link prediction is a computationally intensive problem, requiring lots of complex algorithms to extract features and perform machine learning. In this talk we will show how a data scientist can implement a link prediction pipeline and experiment with different features to obtain the best results for their particular domain. We exemplify using a DBLP co-authorship dataset available online.
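To make the feature-extraction step of link prediction concrete, here is a small plain-Python sketch. It is not the speaker's pipeline: the helper names, the choice of features (common neighbors, Jaccard similarity, preferential attachment), and the toy co-authorship edges are assumptions for illustration. In a fuller pipeline these features would typically be fed to a trained classifier instead of being used directly as a ranking score.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def link_features(adj, u, v):
    """Neighborhood-based features for a candidate edge (u, v)."""
    common = adj[u] & adj[v]
    union = adj[u] | adj[v]
    return {
        "common_neighbors": len(common),
        "jaccard": len(common) / len(union) if union else 0.0,
        "preferential_attachment": len(adj[u]) * len(adj[v]),
    }


def rank_candidates(adj, score="jaccard"):
    """Score every currently unconnected pair and sort best-first."""
    candidates = []
    for u, v in combinations(sorted(adj), 2):
        if v not in adj[u]:
            candidates.append(((u, v), link_features(adj, u, v)[score]))
    return sorted(candidates, key=lambda item: item[1], reverse=True)


# Made-up co-authorship edges (not the DBLP dataset).
coauthor_edges = [("ann", "bob"), ("ann", "cat"), ("bob", "cat"),
                  ("bob", "dan"), ("cat", "dan"), ("dan", "eve")]
adj = build_adjacency(coauthor_edges)
ranking = rank_candidates(adj)
top_pair, top_score = ranking[0]
```

Swapping the `score` argument for another feature name is one simple way to experiment with how different features change the ranking.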

Breaking Down Silos with Knowledge Graphs

Michael Uschold - Semantic Arts

For most of you who work in enterprise computing, silos are the bane of your existence. We explore the origins of silos and point out some technical factors that exert a strong gravitational pull, drawing your enterprise into the deep pit of silos. Chief among these are application-centricity along with limitations in relational database technology, including the lack of explicit semantics. We describe a semantics-based data-centric approach using ontologies and knowledge graphs. We show how it works in practice and illustrate with case studies. We warn against using these newer technologies to gain a local advantage in an organization while ultimately recreating silos across the wider enterprise.
Conclusion: The use of an enterprise ontology as a schema to populate an RDF-based knowledge graph opens the door to removing silos and never creating them again. The technology is mature and ready for prime time.

High Performance JanusGraph Batch & Stream Loading

Ted Wilmes - Expero

You've downloaded JanusGraph, installed it, and run a few queries from the Gremlin console, but what's next? Data loading is the logical next step, but it is a common pain point for JanusGraph newcomers. Inevitably data loading touches on more advanced topics such as performance tuning and an understanding of JanusGraph transaction semantics. This talk will demystify the data loading process by presenting JanusGraph batch and stream loading patterns that you can apply to your next graph database project.

Taming of the Shrew: Using a Knowledge Graph to capture structured Health Information Data

Chris Wixon - Savannah Vascular

Despite billions of dollars invested in modernizing health information infrastructure, the impact on healthcare outcomes has been relatively modest. Patient care remains fragmented, costs continue to rise and medical professionals remain frustrated and dissatisfied. Our success in a new era of digital health will depend on the ability to derive insight from data.
Freedom of expression comes from order and structure: placing limits, and working against them rigorously. Our graph-powered solution offers a schema-driven method of information capture between patient and provider, at the point of service. In contradistinction to a source-oriented method, the knowledge graph models the corpus of medicine itself and incorporates concepts from multiple medical terminology systems (CPT, ICD, SNOMED, NPI, Medical taxonomy) into its persistence layer. A seed pattern of clinically-relevant predicates defines concepts in a formal way to reflect the semantics of the concept. The meta-model supports a uniform user interface and enables efficient documentation by way of a semantic browser: click and discover, rather than search and retrieval.
Implementing a clinically oriented Knowledge Graph saves time for the physician, returns the focus back to the patient, and creates computable medical records for healthcare payers to make more informed decisions.
Point of service documentation minimizes the time lag between the patient encounter and data entry and provides more complete and reliable health information data.
The use of the computer as a recall mechanism for medical knowledge allows medical personnel to function at the top of their credentials from a universe that is more comprehensive than human cognition.
The data gathered in a specific section of the record should be entered in a manner that is unambiguous and consistent among all health practitioners entering it.
Intended audience: Healthcare Professionals, Data Modeling, Domain-driven design (DDD), Knowledge Graph
Technical skills and concepts required: Beginner – Intermediate knowledge

Eight Prerequisites of a Graph Query Language

Dr. Mingxi Wu - TigerGraph

A graph query language is the key to unleashing the value of interconnected data. The talk includes discussion of 8 prerequisites of a graph query language for successful implementation of real-world graph analytics use cases. The talk will present the pros and cons of three query languages: Cypher, Gremlin, and SPARQL. Finally, the talk will provide an overview of GSQL, a Turing-complete graph query language that is a conceptual descendant of Cypher, Gremlin and SPARQL and has incorporated design features from SQL as well as Hadoop MapReduce. The talk will compare the GSQL query language with Gremlin, Cypher and SPARQL, pointing out the differences including pros and cons for each language.