Aijun An

York University
Project Title: Computational support for big data analytics, information extraction, and visualization
Industry Partner: IBM Canada Ltd. 
Project Partner: Amir Asif
Platform: Cloud Analytics

Project Title: Distributed Deep Learning and Graph Analytics Using IBM Spectrum Computing Solutions
Industry Partner: IBM Canada Ltd. 
Project Partner: Amir Asif
Platform: Cloud Analytics, GPU, LMS

Digital Media Cities

Computational Support for Big Data Analytics, Information Extraction, and Visualization

The Centre for Innovation in Visualization and Data Driven Design (CIVDDD), an Ontario ORF-RE project, conducts research that requires SOSCIP resources. It was awarded NSERC CRD funding with IBM Platform [Applications of IBM Platform Computing solutions for solving Data Analytics and 3D Scalable Video Cloud Transcoder Problems], beginning in July 2015.
Two subprojects of CIVDDD require immediate multi-core cluster support with IBM Platform Computing:

  1. Efficient Mining of Frequent Patterns from Big Data;
  2. Exploration of a Scalable Video Cloud Transcoder for Wireless Multicasts.

Frequent itemset mining over data streams is difficult because streaming data is unbounded, continuous, and arrives at high speed. Examples include traffic signals in a large city, energy monitoring in homes and factories, and vital-sign monitoring on mobile devices.

Data stream mining poses several inherent challenges: (1) each data element can be examined at most once; (2) memory consumption must stay bounded even though data elements are generated continuously; (3) every incoming data element should be processed as fast as possible; and (4) analytical results of acceptable quality should be available whenever they are requested. Because of these characteristics, traditional frequent pattern mining algorithms cannot be applied directly.
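The four constraints above can be illustrated with a minimal sketch. It uses the Misra-Gries summary, a classic one-pass algorithm for frequent single items, as a simpler stand-in for frequent itemset mining; it is not the project's own method. The function name `misra_gries` and the parameter `k` are assumptions chosen for this illustration.

```python
def misra_gries(stream, k):
    """One-pass frequent-item summary that keeps at most k - 1 counters.

    Illustrative stand-in, not the project's algorithm. Any item that
    occurs more than n / k times in a stream of length n is guaranteed
    to survive in the summary. Stored counts never exceed true counts.
    """
    counters = {}
    for item in stream:              # each element is examined exactly once
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:  # memory stays bounded by k - 1 entries
            counters[item] = 1
        else:
            # No free counter: decrement every counter and drop the zeros.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters                  # a summary can be read out at any time


if __name__ == "__main__":
    readings = ["a", "b", "a", "c", "a", "d", "a", "e", "a"]
    print(misra_gries(readings, k=3))
```

The summary trades exactness for bounded memory: rare items are forgotten, and the counts that remain are lower bounds on the true frequencies.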

The goal of the transcoding project is to design an error-resilient transcoding framework for mobile 3D video streaming, which transcodes an HD 3D video stream to a mobile, scalable 3D video stream. The initial focus is on homogeneous transcoding, which maps video from one resolution to another. The work will later be extended to heterogeneous transcoding schemes, which convert one known video coding standard to another that is suitable for wireless communications. After encoding at the media server, captured video and generated depth data are streamed through the Internet based on a hierarchical representation. When the receiver is a mobile user, the high bit-rate HD 3D video data is transcoded to low bit-rate mobile 3D video data at the transcoding gateway and is then streamed to the mobile receiver.

All video transcoding schemes are computationally intensive, especially for real-time video delivery. As a solution, the goal is to develop a “cloud transcoder” that uses an intermediate cloud platform to bridge the format/resolution “gap” by performing video transcoding in the cloud. The infrastructure provided by IBM Platform will be used to implement and test the proposed 3D scalable video cloud transcoder for wireless multicasts. To speed up computation, the transcoder implementation will be parallelized across multiple threads, so that independent threads run concurrently in overlapping time periods.
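As a rough illustration of the parallelization idea, the sketch below splits a frame sequence into independent segments, downscales each segment in its own worker thread, and reassembles the results in playback order. Everything here is an assumption made for the example: frames are plain 2D lists of pixel values, resolution is halved by averaging 2x2 blocks, and the names `downscale_frame`, `transcode_segment`, and `cloud_transcode` are hypothetical. A real transcoder would operate on encoded bitstreams with a native codec.

```python
from concurrent.futures import ThreadPoolExecutor


def downscale_frame(frame):
    """Halve the resolution by averaging each 2x2 block of pixel values."""
    height, width = len(frame), len(frame[0])
    return [
        [
            (frame[y][x] + frame[y][x + 1]
             + frame[y + 1][x] + frame[y + 1][x + 1]) // 4
            for x in range(0, width - 1, 2)
        ]
        for y in range(0, height - 1, 2)
    ]


def transcode_segment(segment):
    """Transcode one independent segment (a list of frames)."""
    return [downscale_frame(frame) for frame in segment]


def cloud_transcode(frames, segment_length=2, workers=4):
    """Split frames into segments and transcode the segments concurrently."""
    segments = [
        frames[i:i + segment_length]
        for i in range(0, len(frames), segment_length)
    ]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, so playback order is kept.
        transcoded = list(pool.map(transcode_segment, segments))
    return [frame for segment in transcoded for frame in segment]


if __name__ == "__main__":
    # Five hypothetical 4x4 frames, each filled with a constant value.
    video = [[[i] * 4 for _ in range(4)] for i in range(5)]
    print(cloud_transcode(video))
```

Segments are processed independently, so the only coordination needed is preserving their order. In CPython the global interpreter lock limits the speedup threads give for pure-Python arithmetic, so this shows the structure rather than the performance of a threaded transcoder.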

Distributed Deep Learning and Graph Analytics Using IBM Spectrum Computing Solutions

Deep learning is a popular machine learning technique that has been applied to many real-world problems, ranging from computer vision to natural language processing. In most cases, deep learning has outperformed previous approaches. However, training a deep neural network is very time-consuming, especially on big data. A popular solution is to distribute and parallelize the training process across multiple machines.

Indeed, the race is on to parallelize deep learning! Industry and academic research teams around the world are trying to make deep neural networks train as fast as possible on farms of GPU-capable servers. We are working with our IBM partners to help develop advanced scheduling and messaging techniques for distributed deep learning.
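One common way to distribute training is synchronous data parallelism: each worker computes a gradient on its own shard of the data, the gradients are averaged, and every worker applies the same update. The sketch below simulates this in a single process on a tiny linear model instead of a deep network. It is not the scheduling or messaging technique being developed in this project, and all names (`shard_gradient`, `average_gradients`, `train_data_parallel`) are hypothetical.

```python
def shard_gradient(weights, shard):
    """Gradient of the mean squared error of y = w * x + b on one shard."""
    w, b = weights
    n = len(shard)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in shard) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in shard) / n
    return grad_w, grad_b


def average_gradients(gradients):
    """Stand-in for the all-reduce step that combines worker gradients."""
    count = len(gradients)
    return tuple(sum(g[i] for g in gradients) / count for i in range(2))


def train_data_parallel(data, num_workers=2, learning_rate=0.05, steps=500):
    """Simulate synchronous data-parallel gradient descent in one process."""
    # Each simulated worker owns a fixed, disjoint shard of the data.
    shards = [data[i::num_workers] for i in range(num_workers)]
    weights = (0.0, 0.0)
    for _ in range(steps):
        # In a real cluster these gradients would be computed concurrently
        # on separate machines; here they are computed one after another.
        gradients = [shard_gradient(weights, shard) for shard in shards]
        grad_w, grad_b = average_gradients(gradients)
        weights = (weights[0] - learning_rate * grad_w,
                   weights[1] - learning_rate * grad_b)
    return weights


if __name__ == "__main__":
    points = [(x, 2 * x + 1) for x in range(-4, 4)]  # samples of y = 2x + 1
    print(train_data_parallel(points))
```

With equally sized shards, the averaged gradient equals the gradient over the full dataset, so the result matches single-machine training while the per-step work is divided among the workers. The cost that remains is communication, which is where scheduling and messaging matter.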

In addition, we will develop two real-world applications of distributed deep learning to demonstrate its efficiency and effectiveness. In the first application, we address the video surveillance problem of tracking a moving target over a network of video cameras with partial or no overlap in their coverage. We will use a deep learning approach to identify multiple pedestrians in each video frame, and a particle filter to track moving pedestrians. In the second application, we address the problem of fraud/intrusion detection. We will use graph-based detection, which considers relationships between objects or individuals. Graph-based approaches are powerful because they do not treat objects or individuals in isolation but also consider their network information. We will emphasize graph-based fraud detection methods, which have a number of applications and potentially large impacts.
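To make the idea of using network information concrete, here is a minimal sketch of a "guilt by association" score: each unlabeled node is scored by the fraction of its neighbors that are already known to be fraudulent. This simple heuristic is chosen only for illustration and is not the detection method proposed in the project. The account names, the edge list, and the function names are all hypothetical.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def neighbor_fraud_scores(edges, known_fraud):
    """Score each unlabeled node by its share of known-fraud neighbors."""
    adjacency = build_adjacency(edges)
    scores = {}
    for node, neighbors in adjacency.items():
        if node in known_fraud:
            continue  # already labeled, nothing to score
        flagged = sum(1 for neighbor in neighbors if neighbor in known_fraud)
        scores[node] = flagged / len(neighbors)
    return scores


if __name__ == "__main__":
    # Hypothetical transaction links between accounts.
    links = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")]
    print(neighbor_fraud_scores(links, known_fraud={"a", "b"}))
```

Account "c" has no fraud label of its own, yet it scores highest because two of its three neighbors are flagged. A method that looks at each account in isolation would miss this signal.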