Confirmed Sessions at Graph Day SF

We are now beginning to announce the confirmed sessions. Check this page regularly for updates.

Project Konigsburg - A GraphAI

Denis Vrdoljak / Danny Wudka - Berkeley Data Science Group

In this presentation, we will talk about our research and development towards creating an AI that can predict connections within graph networks. Unlike typical prediction methods based on counting wedges (e.g., counting “mutual friends”) or requiring outside knowledge (e.g., syncing with email or contacts lists), we will talk about how we employed different triadic measurements to engineer features for our machine learning models to predict connections based on relationship patterns, specific to different applications. We will also go over some of the applications we have in mind for our system – including recommending stores and restaurants based on social connections’ shopping patterns, predicting future social or professional contacts, and even possible applications in counterintelligence and counterterrorism.
We will cover some of the challenges that we faced, such as biased uncertainty in training data, single-classifier approaches, limitations of existing graph databases, and adapting heuristics to the application – after all, we don’t expect recommending coffee shops to work with the same parameters as identifying sleeper cells!
Finally, we will review the different machine learning models that we evaluated and discuss their trade-offs. We will conclude with a brief demo of our system in action and a look at some of the new developments and possibilities we learned about at Data Day Texas earlier this year.
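
As a rough illustration of what triadic feature engineering for link prediction can look like, here is a minimal sketch. The abstract does not say which triadic measurements the speakers use, so the features below (common neighbors, Jaccard similarity, Adamic-Adar score, and local clustering coefficient) are standard choices picked for illustration, and the function names and toy graph are hypothetical.

```python
import math
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def clustering(adj, node):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    closed = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return closed / (k * (k - 1) / 2)


def pair_features(adj, u, v):
    """Features for a candidate edge (u, v); illustrative choices only."""
    common = adj[u] & adj[v]
    union = adj[u] | adj[v]
    return {
        # Plain wedge count, i.e. "mutual friends".
        "common_neighbors": len(common),
        "jaccard": len(common) / len(union) if union else 0.0,
        # A common neighbor of two distinct nodes has degree >= 2,
        # so log(degree) is always positive here.
        "adamic_adar": sum(1 / math.log(len(adj[w])) for w in common),
        "clustering_u": clustering(adj, u),
        "clustering_v": clustering(adj, v),
    }


# Hypothetical toy graph.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
adj = build_adjacency(edges)
features = pair_features(adj, "a", "d")
```

Feature dictionaries like this one could be computed for known edges and sampled non-edges, then passed to any classifier; which features matter is likely to depend on the application.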

Graph-based Taxonomy Generation

Rob McDaniel - Live Stories

How do you automatically generate a taxonomy from a corpus? Getting topics is easy, but organizing them into any meaningful hierarchy is expensive. This talk will cover the real-world application of graph-based taxonomy generation from a weighted topic graph, as proposed by Treeratpituk et al. The lecture will include a brief overview of multi-level graph partitioning, the generation of an edge- and vertex-weighted graph, and a basic open-source implementation with samples.
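
To make the idea of an edge- and vertex-weighted topic graph concrete, here is a minimal sketch. It assumes each document has already been reduced to a list of topic labels, and it uses simple counts as weights (document frequency for vertices, co-occurrence count for edges). This weighting is an assumption for illustration, not necessarily the scheme from Treeratpituk et al., and the function name and sample data are hypothetical. The partitioning step that turns the graph into a hierarchy is not shown.

```python
from collections import Counter
from itertools import combinations


def build_topic_graph(docs):
    """Return (vertex_weight, edge_weight) for a topic co-occurrence graph.

    vertex_weight[topic]  = number of documents mentioning the topic
    edge_weight[(t1, t2)] = number of documents mentioning both topics,
                            with t1 < t2 so each pair is stored once
    """
    vertex_weight = Counter()
    edge_weight = Counter()
    for topics in docs:
        unique = sorted(set(topics))  # ignore repeats within a document
        vertex_weight.update(unique)
        edge_weight.update(combinations(unique, 2))
    return vertex_weight, edge_weight


# Hypothetical corpus: one topic list per document.
docs = [
    ["graphs", "databases"],
    ["graphs", "genomics"],
    ["graphs", "databases", "visualization"],
]
vertex_weight, edge_weight = build_topic_graph(docs)
```

A graph partitioner could then be applied recursively to this weighted graph, with heavily weighted vertices as natural candidates for higher levels of the taxonomy.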

Graphs in Genomics

Jason Chin - Pacific Biosciences

Since the discovery of DNA molecules, graph theory and methods have been used in analyzing genomes. Recent progress in high-throughput DNA sequencing instruments has further advanced the state of the art in using graphs to understand genomics. Jason will present recent advances in this field to the data science community.
Jason will begin by going over a few examples where graphs are used to encode genomics information for human health. He will then dive a bit into the graph theory used for a specific problem: "genome assembly" - essentially, how bioinformaticians currently use graphs in practice to put millions of smaller pieces of DNA sequence (on the order of a hundred gigabytes of data) into contiguous genome sequences (several Mb to several Gb).
Jason will 1) define the problem, 2) give an overview of the general approach, 3) compare the topological and statistical properties of assembly graphs with those of other kinds of graphs, e.g., social networks or small-world graphs, and 4) demonstrate a specific end-to-end example so that people can see the whole process. Jason will wrap up the talk with a view toward future challenges: scaling computation, new related theoretical problems, and standardization for related graph processing.
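
For readers new to the topic, the core idea of graph-based assembly can be sketched with a toy overlap graph: reads are vertices, and a directed edge connects two reads when a suffix of one matches a prefix of the other. The sketch below is not the speaker's method. It assumes error-free reads, a single strand, no repeats, and a `min_len` of at least 1, and all function names and reads are hypothetical. Production assemblers have to deal with sequencing errors, reverse complements, repeats, and far larger inputs.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a equal to a prefix of b (0 if < min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def build_overlap_graph(reads, min_len):
    """Directed graph {read: {next_read: overlap_length}}."""
    graph = {}
    for a in reads:
        for b in reads:
            if a != b:
                n = overlap(a, b, min_len)
                if n:
                    graph.setdefault(a, {})[b] = n
    return graph


def greedy_assemble(reads, min_len):
    """Walk the overlap graph greedily, always taking the largest overlap."""
    graph = build_overlap_graph(reads, min_len)
    has_incoming = {b for targets in graph.values() for b in targets}
    # Start from a read that nothing overlaps into, if one exists.
    starts = [r for r in reads if r not in has_incoming]
    current = starts[0] if starts else reads[0]
    contig, visited = current, {current}
    while True:
        candidates = {
            b: n for b, n in graph.get(current, {}).items() if b not in visited
        }
        if not candidates:
            return contig
        nxt = max(candidates, key=candidates.get)
        contig += nxt[candidates[nxt]:]  # append only the non-overlapping tail
        visited.add(nxt)
        current = nxt


# Hypothetical error-free reads, given out of order.
reads = ["GCGTGCA", "TGCAATC", "ATGGCGT"]
contig = greedy_assemble(reads, min_len=3)
```

The all-pairs overlap computation here is quadratic in the number of reads, which hints at why computation scaling is one of the challenges listed above.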

Investigating patterns of human trafficking through graph visualization

Christian Miles - Cambridge Intelligence

It is estimated that at any given time, 2.5 million people are in forced labour (including sexual exploitation) as a result of trafficking. The vast majority of victims are between 18 and 24 years of age. In this talk, Christian will walk the audience through the steps taken to collect, visualize and analyze a unique dataset of 22,500 classified advertisements in order to identify potential indicators of human trafficking. This talk and the associated work follow a methodology similar to that of previous studies completed in Hawaii, but apply a distinctly graph-oriented approach to a whole new geography. Methods used will include graph modelling/visualization, web scraping, text mining and geospatial analysis. Christian will demonstrate how his analysis reveals valuable new insights into trafficking routes and highlights patterns of exploitation that could be used to prevent trafficking crimes.