Graph Day SF 2018 sessions

We're just now beginning to announce the sessions. Expect many updates over the next few weeks.

Navigating Time and Probability in Knowledge Graphs

Jans Aasman - Franz, Inc.

The market for knowledge graphs is rapidly developing and evolving to solve widely acknowledged deficiencies with data warehouse approaches. Graph databases are providing the foundation for these knowledge graphs, and in our enterprise customer base we see two approaches forming: static knowledge graphs and dynamic, event-driven knowledge graphs. Static knowledge graphs focus mostly on metadata about entities and the relationships between these entities, but they don’t capture ongoing business processes. DBpedia, GeoNames, Census and PubMed are great examples of static knowledge.
Dynamic knowledge graphs are used in the enterprise to facilitate internal processes, improve products or services, or gather dynamic knowledge about customers. I recently authored an IEEE article describing this evolution of knowledge graphs in the enterprise, and during this presentation I will describe two critical success factors for dynamic knowledge graphs: a uniform way to model, query and interactively navigate time, and the power of incorporating probabilities into the graph. The presentation will cover three use cases and live demos showing how the confluence of knowledge via machine learning, visual querying, distributed graph databases, and big data not only displays links between objects, but also quantifies the probability of their occurrence.

Graph Database + Legacy Application = Hard

Dave Bechberger - GeneByGene

"Let's use a graph to add special project alpha to our product" - Big Boss
"Our application is 15 years old and built on SQL server" - Team
"This project alpha was funded as a graph problem so we are using one" - Big Boss

Whether you are new to graph databases or a seasoned veteran, implementing new technology in legacy applications is hard. Implementing one as relatively new and unknown as graph databases is even harder. The reality is that almost every application is a legacy application, and in order to make them better, taking the hard path turns out to be the best approach.
In this talk, we will walk through both pleasant and painful experiences adding graph databases into legacy applications. Dave will share war stories and battle scars, and walk through common patterns and anti-patterns to help you forge your roadmap to success. In the end, you will hopefully come away with a sense of what warning signs to watch out for when starting this kind of project and a better understanding of what not to do.

Building a Knowledge Graph

Dan Bennett - Thomson Reuters

Just a few years ago, knowledge graphs were the domain of academic papers; today they underpin the natural language capabilities of Alexa, Siri, Cortana and Google Now. Graphs are a natural fit for this use case: treating every data item as equivalent and embracing rapid schema mutation. For the past few years, Thomson Reuters has been building a professional information knowledge graph to power our next generation of products. Our graph is RDF-based, fast-growing and supports a number of different products and user experiences. In this session, Dan will cover our experiences, architecture, tools and lessons learned from building, integrating and maintaining a 100bn-triple graph.

From Theory to Production

Dr. Denise Gosnell - DataStax

We are here to build applications with graph data and deliver value. The graph community has spent years defining and describing our passion. To translate graph thinking into a production application, there is a suite of hard decisions that have to be made. It's time for graph to go mainstream!
This talk will walk through some practical and tangible decisions that come into play when shipping distributed graph applications. Developers need a tangible set of playbooks to work from, and my years of experience have narrowed the list down to the decisions that are most universal and most difficult to spot. Let's see how well they match up with yours.

Comparing GraphFrames access methods in DSE Graph

Jim Hatcher - DataStax

GraphFrames is a powerful feature in Spark that allows you to harness Spark's distributed computing framework to operate on your graph. Tasks like data ingestion, schema migrations, and analytical jobs can all be run against your graph. In DSE Graph, there are several methods to leverage GraphFrames, including Gremlin, Spark SQL, and Motif. In this talk, Jim will walk through the basics of using GraphFrames with DSE Graph; he will then show how these different methods can be used and how you can evaluate which one is best for your use case.

Graph Based Malware Analysis

Florian Hockmann / Stefan Hausotte - G DATA Software

We will present our use case for graph databases, where we search for similar malware samples based on their behavior. As an anti-virus vendor, we analyze several hundred thousand potential malware samples per day. These samples belong to only a few malware families whose members share a lot of behavior features. We use this fact to cluster together all samples that belong to the same family by connecting all samples that exhibit the same features via those common features. The behavior features are extracted from malware samples with the help of automatic analysis tools and inserted into a JanusGraph database. This talk shows the advantages a graph database has to offer for automatic and manual malware analysis.
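The idea of linking samples through the behavior features they share can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the speakers' pipeline: the sample names, feature names and the `cluster_by_shared_features` helper are hypothetical, and treating every connected component as one family is a deliberately naive reading of "connecting samples via common features" (a real system would store the data in a graph database and likely weight or filter features).

```python
from collections import defaultdict


def cluster_by_shared_features(sample_features):
    """Group samples linked, directly or transitively, by a shared feature.

    Samples and features are the two vertex types of a bipartite graph;
    each connected component is treated as one cluster (an assumption).
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    for sample, features in sample_features.items():
        find(("sample", sample))  # register samples that have no features
        for feature in features:
            # An edge sample -> feature merges the two components.
            union(("sample", sample), ("feature", feature))

    clusters = defaultdict(set)
    for sample in sample_features:
        clusters[find(("sample", sample))].add(sample)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical samples and behavior features, for illustration only.
SAMPLES = {
    "sample_a": {"writes_run_key", "contacts_host_1"},
    "sample_b": {"contacts_host_1", "drops_dll"},
    "sample_c": {"encrypts_documents"},
    "sample_d": {"encrypts_documents", "deletes_shadow_copies"},
}

if __name__ == "__main__":
    print(cluster_by_shared_features(SAMPLES))
```

Here `sample_a` and `sample_b` end up together because both contact the same host, while `sample_c` and `sample_d` are joined by the shared encryption behavior.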

How Do *You* Graph? - Minimizing Developer Impedance

Ben Krug - DataStax

We're often told that graph databases are entirely different from relational and other databases. Graph traversal tools, like Gremlin, often look and feel imperative, whereas tools like SQL are basically declarative. But are they really completely different?
Upon examination, the real issue is which approaches will help a developer obtain the results they need most quickly and efficiently.
We will contrast different views of, and access methods to, graph data, focusing on examples from Apache TinkerPop's Gremlin (does traversals, feels imperative and graphy) and Apache Spark SQL (does queries, feels declarative and RDBMSy). Gremlin can be leveraged from a variety of programming languages, thanks in part to Gremlin Language Variants, or from the Gremlin Console. Spark SQL makes use of Spark architecture and concepts, and allows you to build on existing relational experience.
There are many ways to look at and use any data, including graph data. This talk will consider some different graph approaches, their strengths and weaknesses, and demonstrate the use of each to accomplish "graphy" tasks.
Intended audience: developers or admins, to give an overview of the tools and their use, and help elucidate which tools may be the best fit for them.
Required skills: some familiarity with relational and graph databases.

Data Modeling with an FU to Super Nodes

Jonathan Lacefield - DataStax

Graph databases are receiving a lot of hype these days because of the promise of fast and flexible queries that aren’t possible within either traditional RDBMSs or NoSQL stores built on simple/singular access patterns. There are some practical tips and tricks that help ensure your graph database project lives up to the hype. In this talk, we will walk through the data modeling tips and tricks that are being used to help graph users achieve success. We’ll also highlight how to avoid the largest graph problem that can plague any graph database project, the dreaded supernode. This will be a demo-led presentation with lots of examples. Participants from beginner to advanced are welcome, as there’s something for everyone to learn.
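For readers unfamiliar with the term, a supernode is a vertex with a disproportionately large number of incident edges. A minimal Python sketch of spotting such vertices in an edge list follows; the `find_supernodes` helper, the threshold and the example data are hypothetical and are not taken from the talk.

```python
from collections import Counter


def find_supernodes(edges, threshold):
    """Return vertices whose degree (in plus out) is at least `threshold`."""
    degree = Counter()
    for src, dst in edges:
        degree[src] += 1
        degree[dst] += 1
    return sorted(v for v, d in degree.items() if d >= threshold)


# Hypothetical model: every user vertex points at one shared country vertex,
# which is a common way to create a supernode by accident.
EDGES = [(f"user_{i}", "country_us") for i in range(1000)]
EDGES += [("user_0", "user_1"), ("user_1", "user_2")]

if __name__ == "__main__":
    print(find_supernodes(EDGES, threshold=100))
```

What counts as "too many" edges depends on the database and the query patterns, so the threshold here is an arbitrary choice for the example.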

How to Destroy Your Graph Project with Terrible Visualization

Christian Miles - Cambridge Intelligence

We are all using graphs for a reason - in many cases, it's because the graph model presents an intuitive view of the data. Unfortunately, the most elegant graph data models can often be stymied by bad visualizations that obscure rather than enlighten. In this talk, Christian Miles will discuss a number of bad practices in graph visualization that are surprisingly common. He will then outline graph visualization best practices to help create visual interfaces to graph data that convey useful insight into the data.

Visualizing Graph Data - Moving Beyond Node Views

Lynn Pausic - Expero

When people talk about visualizing graph data, what typically comes to mind is the canonical node view. Node views typically display nodes (vertices) and the relationships (edges) between them. With large data sets consisting of millions of vertices and edges, node views can quickly become unwieldy to use and comprehend. Traditional UI patterns and visualizations conceived for relational schemas often don’t work with graph data. Relational schemas are predefined and relatively static, making it easy to tailor UI navigation to the available data dimensions. Due to the distinct mathematical nature of graph data, traversing data in a graph is fairly different. While this presents additional challenges, there are also opportunities. Traversing a graph with certain algorithms allows you to, for example, show key influencers in social networks, clusters of communities in customer reviews or weak points in electrical grids. These new insights into data provide novel tools to craft the user experience. But this opportunity comes at a price, namely more complexity. Through building and deploying dozens of applications driven by graph data, we’ve developed a unique approach to building UIs driven by graph data and an arsenal of data visualizations that work well across a broad range of contexts. In this talk we’ll share various tools and examples for displaying graph data in meaningful ways to users.

Knowledge Graphs for AI

Mike Tung - Diffbot

Leveraging structured knowledge, about your organization as well as the outside world, will be a critical ingredient in the design of the next wave of intelligent applications—for everything from new app experiences, to search assistants, to enterprise business intelligence. This talk will give a review of the current open source and commercial knowledge graphs and how consumer and business applications are already taking advantage of these to provide intelligent experiences and enhanced business efficiency.

Distributed ACID with JanusGraph on FoundationDB

Ted Wilmes - Expero

The popular open source JanusGraph property graph database supports a variety of different storage layers, each with their own operating characteristics and constraints, but up until now has not had a distributed ACID option. Earlier this year, Apple announced the open sourcing of FoundationDB, a high performance distributed key-value store with ACID guarantees. Could this be a match made in distributed serializable graph isolation heaven? This talk will explore this question in detail, starting with an overview of FoundationDB, followed by a discussion of why ACID matters in a graph database. We’ll finish with the implementation details of the new, experimental JanusGraph FoundationDB adapter and early performance results.