Google at KDD’17: Graph Mining and Beyond
Wednesday, August 23, 2017
Posted by Bryan Perozzi, Research Scientist, NYC Algorithms and Optimization Team
The 23rd ACM conference on Knowledge Discovery and Data Mining (KDD’17), a main venue for academic and industry research in data science, information retrieval, data mining and machine learning, was held last week in Halifax, Canada. Google has historically been an active participant in KDD, and this year was no exception, with Googlers contributing numerous papers and participating in workshops.
In addition to our overall participation, we are happy to congratulate fellow Googler Bryan Perozzi for receiving the SIGKDD 2017 Doctoral Dissertation Award, which recognizes excellent research by doctoral candidates in the field of data mining and knowledge discovery. This award was given in recognition of his thesis on machine learning on graphs, completed at Stony Brook University under the supervision of Steven Skiena. Part of his thesis was developed during his internships at Google. The thesis dealt with using a restricted set of local graph primitives (such as ego-networks and truncated random walks) to effectively exploit the information around each vertex for classification, clustering, and anomaly detection. Most notably, the work introduced the random-walk paradigm for graph embedding with neural networks in DeepWalk.
DeepWalk: Online Learning of Social Representations, originally presented at KDD'14, outlines a method for using local information obtained from a series of truncated random walks to learn latent representations of nodes in a graph (e.g. users in a social network). The core idea was to treat each segment of a random walk as a sentence “in the language of the graph.” These segments could then be used as input for neural network models to learn representations of the graph’s nodes, using sequence modeling methods like word2vec (which had just been developed at the time). This research continues at Google, most recently with Learning Edge Representations via Low-Rank Asymmetric Projections.
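To make the "walks as sentences" idea concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the DeepWalk implementation: the toy graph, the function names (`truncated_random_walks`, `skipgram_pairs`) and the parameter values are all hypothetical. It generates truncated random walks and then extracts the (center, context) node pairs that a skip-gram model such as word2vec would train on. The embedding training step itself is omitted.

```python
import random


def truncated_random_walks(adjacency, walks_per_node, walk_length, seed=0):
    """Return fixed-length random walks, several starting from every node.

    `adjacency` maps each node to a list of its neighbors (hypothetical format).
    """
    rng = random.Random(seed)
    nodes = sorted(adjacency)
    walks = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)  # visit start nodes in a fresh random order each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


def skipgram_pairs(walks, window):
    """Treat each walk as a sentence and collect (center, context) pairs."""
    pairs = []
    for walk in walks:
        for i, center in enumerate(walk):
            lo = max(0, i - window)
            hi = min(len(walk), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    pairs.append((center, walk[j]))
    return pairs


# Toy undirected graph (assumed for illustration only).
graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}

walks = truncated_random_walks(graph, walks_per_node=2, walk_length=5)
pairs = skipgram_pairs(walks, window=2)
```

Each walk plays the role of a sentence and each node the role of a word, so the resulting pairs can be handed to any skip-gram style trainer. Nodes that co-occur often within the window end up with similar representations.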
The full list of Google contributions at KDD’17 appears below (Googlers highlighted in blue).
Organizing Committee
Panel Chair: Andrew Tomkins
Research Track Program Chair: Ravi Kumar
Applied Data Science Track Program Chair: Roberto J. Bayardo
Research Track Program Committee: Sergei Vassilvitskii, Alex Beutel, Abhimanyu Das, Nan Du, Alessandro Epasto, Alex Fabrikant, Silvio Lattanzi, Kristen Lefevre, Bryan Perozzi, Karthik Raman, Steffen Rendle, Xiao Yu
Applied Data Science Track Program Committee: Edith Cohen, Ariel Fuxman, D. Sculley, Isabelle Stanton, Martin Zinkevich, Amr Ahmed, Azin Ashkan, Michael Bendersky, James Cook, Nan Du, Balaji Gopalan, Samuel Huston, Konstantinos Kollias, James Kunz, Liang Tang, Morteza Zadimoghaddam
Awards
Doctoral Dissertation Award: Bryan Perozzi, for Local Modeling of Attributed Graphs: Algorithms and Applications.
Doctoral Dissertation Runner-up Award: Alex Beutel, for User Behavior Modeling with Large-Scale Graph Analysis.
Papers
Ego-Splitting Framework: from Non-Overlapping to Overlapping Clusters
Alessandro Epasto, Silvio Lattanzi, Renato Paes Leme
HyperLogLog Hyperextended: Sketches for Concave Sublinear Frequency Statistics
Edith Cohen
Google Vizier: A Service for Black-Box Optimization
Daniel Golovin, Benjamin Solnik, Subhodeep Moitra, Greg Kochanski, John Karro, D. Sculley
Quick Access: Building a Smart Experience for Google Drive
Sandeep Tata, Alexandrin Popescul, Marc Najork, Mike Colagrosso, Julian Gibbons, Alan Green, Alexandre Mah, Michael Smith, Divanshu Garg, Cayden Meyer, Reuben Kan
TFX: A TensorFlow Based Production Scale Machine Learning Platform
Denis Baylor, Eric Breck, Heng-Tze Cheng, Noah Fiedel, Chuan Yu Foo, Zakaria Haque, Salem Haykal, Mustafa Ispir, Vihan Jain, Levent Koc, Chiu Yuen Koo, Lukasz Lew, Clemens Mewald, Akshay Modi, Neoklis Polyzotis, Sukriti Ramesh, Sudip Roy, Steven Whang, Martin Wicke, Jarek Wilkiewicz, Xin Zhang, Martin Zinkevich
Construction of Directed 2K Graphs
Balint Tillman, Athina Markopoulou, Carter T. Butts, Minas Gjoka
A Practical Algorithm for Solving the Incoherence Problem of Topic Models In Industrial Applications
Amr Ahmed, James Long, Dan Silva, Yuan Wang
Train and Distribute: Managing Simplicity vs. Flexibility in High-Level Machine Learning Frameworks
Heng-Tze Cheng, Lichan Hong, Mustafa Ispir, Clemens Mewald, Zakaria Haque, Illia Polosukhin, Georgios Roumpos, D. Sculley, Jamie Smith, David Soergel, Yuan Tang, Philip Tucker, Martin Wicke, Cassandra Xia, Jianwei Xie
Learning to Count Mosquitoes for the Sterile Insect Technique
Yaniv Ovadia, Yoni Halpern, Dilip Krishnan, Josh Livni, Daniel Newburger, Ryan Poplin, Tiantian Zha, D. Sculley
Workshops
13th International Workshop on Mining and Learning with Graphs
Keynote Speaker: Vahab Mirrokni - Distributed Graph Mining: Theory and Practice
Contributed talks include:
HARP: Hierarchical Representation Learning for Networks
Haochen Chen, Bryan Perozzi, Yifan Hu and Steven Skiena
Fairness, Accountability, and Transparency in Machine Learning
Contributed talks include:
Fair Clustering Through Fairlets
Flavio Chierichetti, Ravi Kumar, Silvio Lattanzi, Sergei Vassilvitskii
Data Decisions and Theoretical Implications when Adversarially Learning Fair Representations
Alex Beutel, Jilin Chen, Zhe Zhao, Ed H. Chi
Tutorial
TensorFlow
Rajat Monga, Martin Wicke, Daniel ‘Wolff’ Dobson, Joshua Gordon