- AVS3 Video Coding Standard
- Workshop on ICT and Multimedia Tools for Migrants Inclusion in Host societies (WIMMIH2020)
- The 1st International Workshop on Interactive Multimedia Retrieval
- Tools for Creating XR Media Experiences
- Multimedia Services and Technologies for Smart-Health (MUST-SH 2020)
- 3D Point Cloud Processing, Analysis, Compression, and Communication (PC-PACC)
- The 1st ICME Workshop on Hyper-Realistic Multimedia for Enhanced Quality of Experience
- The 7th IEEE International Workshop on Mobile Multimedia Computing (MMC 2020)
- IEEE International Workshop of Artificial Intelligence in Sports (AI-Sports)
- The 2nd International Workshop on Big Surveillance Data Analysis and Processing (BIG-Surv)
- Data-driven Just Noticeable Difference for Multimedia Communication
- Media-Rich Fake News (MedFake)
Workshop Chairs
- Noel O’Connor, Dublin City University, Ireland
- Raouf Hamzaoui, De Montfort University, UK
Organisers:
- Siwei Ma, Peking University, China
- Lu Yu, Zhejiang University, China
- Xiaozhen Zheng, DJI, China
- Li Zhang, Bytedance, USA
- Shan Liu, Tencent America, USA
Description:
AVS3 is the latest video coding standard developed by the China AVS workgroup, targeting emerging 4K/8K and VR applications. To date, AVS3 has adopted many efficient new video coding tools, such as extended quad-tree block partitioning, a boundary filter for intra prediction, and a flexible reference picture list management scheme. AVS3 shows significant coding gains over previous video coding standards. Recently, HiSilicon announced the first AVS3 8K@120p decoder chip at IBC 2019. Moreover, the AVS workgroup has also carried out extensive exploration of deep learning-based compression, where both piecemeal (tool-by-tool) and end-to-end approaches have been studied.
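As a rough illustration of the block-partitioning idea behind tools like the extended quad-tree, the sketch below enumerates the leaf blocks of a recursive quad split. It is a simplified, hypothetical example (the cost function, threshold, and plain four-way split are assumptions), not the normative AVS3 partitioning logic, which couples splitting to rate-distortion optimization and adds asymmetric EQT split shapes.

```python
# Minimal sketch: recursive quad-tree block partitioning in the spirit of
# AVS3's extended quad-tree (EQT). The split decision here (a fixed cost
# threshold) and all names are hypothetical; the real standard couples
# splitting to rate-distortion optimization and adds asymmetric EQT shapes.

def partition(x, y, w, h, cost_fn, min_size=8, threshold=1.0):
    """Return a list of (x, y, w, h) leaf blocks covering the region."""
    # Stop splitting when the block is small or cheap enough to code whole.
    if w <= min_size or h <= min_size or cost_fn(x, y, w, h) < threshold:
        return [(x, y, w, h)]
    hw, hh = w // 2, h // 2
    blocks = []
    # Plain quad split into four equal sub-blocks; EQT would also try
    # asymmetric four-way splits and keep the cheapest option.
    for (bx, by) in [(x, y), (x + hw, y), (x, y + hh), (x + hw, y + hh)]:
        blocks += partition(bx, by, hw, hh, cost_fn, min_size, threshold)
    return blocks

if __name__ == "__main__":
    # Toy cost: pretend the top-left of the frame is detailed (high cost).
    toy_cost = lambda x, y, w, h: 2.0 if (x < 32 and y < 32) else 0.5
    for block in partition(0, 0, 64, 64, toy_cost):
        print(block)
```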
This workshop aims to bring together academic researchers, industrial practitioners, and individuals working on this exciting emerging research area to disseminate their new ideas, latest findings, and state-of-the-art results related to AVS3 development.
Scope and Topics:
Topics of interest include, but are not limited to:
- Coding tools
- Software/hardware implementations
- System transport
- Quality evaluation
- Learning based image/video compression
Schedule:
Monday, July 6 – London (BST time zone)
Session 1
Session Chairs: Siwei Ma (Peking University) and Xiaozhen Zheng (DJI)
Start | End | Paper | Authors |
---|---|---|---|
09:00 | 09:20 | Performance Evaluation For AVS3 Video Coding Standard | Xiaozhen Zheng1,2, Qingmin Liao1, Yueming Wang2, Ze Guo2, Jianglin Wang2, Yan Zhou2 (1Tsinghua Shenzhen International Graduate School, 2SZ DJI Technology Co., Ltd) |
09:20 | 09:40 | Performance And Computational Complexity Analysis Of Coding Tools In AVS3 | Kui Fan1, Yangang Cai1, Xuesong Gao2, Weiqiang Chen2, Shengyuan Wu1, Zhenyu Wang1, Ronggang Wang1, Wen Gao1 (1Peking University Shenzhen Graduate School, 2Hisense Co., Ltd.) |
09:40 | 10:00 | Prediction With Multi-Cross Component | Junru Li1, Li Zhang2, Kai Zhang2, Hongbin Liu2, Meng Wang3, Shiqi Wang3, Siwei Ma1, Wen Gao1 (1Peking University, 2Bytedance Inc., 3City University of Hong Kong) |
10:00 | 10:20 | Adaptive Motion Vector Resolution In AVS3 Standard | Chuan Zhou1, Zhuoyi Lv1, Yinji Piao1, Yue Wu1, Kiho Choi2, Kwang Pyo Choi2 (1Samsung Research China, 2Samsung Electronics) |
10:20 | 10:40 | Coffee Break | |
10:40 | 11:00 | Affine Direct/Skip Mode With Motion Vector Differences In Video Coding | Tianliang Fu1, Kai Zhang2, Hongbin Liu2, Li Zhang2, Shanshe Wang1, Siwei Ma1, Wen Gao1 (1Peking University, 2Bytedance Inc.) |
11:00 | 11:20 | Implicit-Selected Transform In Video Coding | Yuhuai Zhang1, Kai Zhang2, Li Zhang2, Hongbin Liu2, Yue Wang2, Shanshe Wang1, Siwei Ma1, Wen Gao1 (1Peking University, 2Bytedance Inc.) |
11:20 | 11:40 | Scan Region-Based Coefficient Coding In AVS3 | Zhuoyi Lv1, Yinji Piao2, Yue Wu1, Kiho Choi2, Kwang Pyo Choi2 (1Samsung Research China, 2Samsung Electronics) |
11:40 | 12:00 | Intra Block Copy In AVS3 Video Coding Standard | Yingbin Wang, Xiaozhong Xu, Shan Liu (Tencent) |
12:00 | 12:20 | History Based Block Vector Predictor For Intra Block Copy | Wenbin Yin1, Jizheng Xu2, Li Zhang2, Kai Zhang2, Hongbin Liu2, Xiaopeng Fan1 (1Harbin Institute of Technology, 2Bytedance Inc.) |
Session 2
Session Chairs: Siwei Ma (Peking University) and Li Zhang (Bytedance Inc.)
Start | End | Paper | Authors |
---|---|---|---|
14:00 | 14:20 | UAVS3D - Fast Multi-Platform And Open Source Decoder For AVS3 | Jiang Du, Zhenyu Wang, Bingjie Han, Ronggang Wang (Shenzhen Graduate School, Peking University) |
14:20 | 14:40 | GPU Based Real-Time UHD Intra Decoding For AVS3 | Xu Han1, Bo Jiang2, Shanshe Wang2, Lin Li3, Yi Su3, Siwei Ma2, Wen Gao2 (1Shanghai Jiao Tong University, 2Peking University, 3MIGU Co., Ltd) |
14:40 | 15:00 | Inheritability-Inspired Intra Coding Optimization For AVS3 | Jiaqi Zhang1, Xuewei Meng1, Chuanmin Jia1, Yi Su2, Song Xu2, Shanshe Wang1, Siwei Ma1, Wen Gao1 (1Peking University, 2MIGU Co., Ltd) |
15:00 | 15:20 | Coffee Break | |
15:20 | 15:40 | Residual Convolutional Neural Network Based In-Loop Filter With Intra And Inter Frames Processed Respectively For AVS3 | Han Zhu, Xiaozhong Xu, Shan Liu (Tencent) |
15:40 | 16:00 | Deep Learning Based Intra Prediction Filter In AVS3 | Chentian Sun, Xiaopeng Fan, Debin Zhao (Harbin Institute of Technology) |
16:00 | 16:20 | CNN-Based Inter Prediction Refinement For AVS3 | Zhicong Zhang1, Xiaopeng Fan1, Debin Zhao1, Wen Gao 2 (1Harbin Institute of Technology, 2Peking University) |
Organisers:
- Dr Petros Daras, Centre for Research and Technology Hellas (CERTH), Greece
- Dr Nicholas Vretos, Centre for Research and Technology Hellas (CERTH), Greece
- Prof. Federico Alvarez, Universidad Politecnica de Madrid (UPM), Spain
- Dr Theodoros Semertzidis, Centre for Research and Technology Hellas (CERTH), Greece
- Prof. Yuri Adrian Tijerino, Kwansei Gakuin University, Japan
Description:
Migrants' integration into host societies poses many challenges at different levels. From job seeking to enrolling in education, and from asylum seekers to vulnerable refugees, the spectrum of tools that can be created to assist these people and host authorities is vast. Multimedia ICT solutions have been devised to cope with many of these issues, and Artificial Intelligence (AI) and Machine Learning (ML) tools have so far been used to help migrants and host authorities provide better services, for the benefit of both migrants and host societies. It is evident that migration flows rise due to regional conflicts and/or shifting environmental conditions, and that new tools need to be researched and developed to support a smooth integration of these people into host societies. In this volatile, intercultural landscape, with the need to support many different languages and to cope with illiteracy and a lack of technology skills, multimedia approaches that reduce the need for written communication appear to be the most effective.
The aim of this workshop is to call for a coordinated effort to understand the scenarios and challenges emerging in ICT solutions for migrants' inclusion in host societies through AI- and ML-based multimedia tools, to identify the key tasks, and to evaluate the current state of the art in this domain. Moreover, the workshop will showcase innovative ideas in the area that aid the smooth integration of migrants into host societies, and will discuss further directions. We solicit manuscripts from all fields that explore the synergies of multimedia tools with AI and ML towards assisting migrants and host authorities in achieving a smooth inclusion of the former in a host society.
Scope and Topics:
We believe the workshop will offer a timely collection of research updates to benefit researchers and practitioners working in broad fields ranging from computer vision to artificial intelligence and machine learning, with an emphasis on multimedia-related solutions. To this end, we solicit original research and survey papers addressing the topics listed below (but not limited to them):
- AI technologies for multimedia game-based skill assessment;
- AI technologies for video and image-based migration flow analysis;
- AI technologies for skill-job matching;
- AI technologies for video and image-based migration flow prediction;
- AI technologies for automatic administrative multimedia-based assistance;
- AI technologies for multimedia based intercultural communication assistance;
- Data analytics and demo systems for large scale job seeking services;
- Migration related multimedia datasets and evaluation protocols;
- AI-assisted or human-AI co-operated technologies for administrative multimedia-based assistance;
- Emerging new applications in Multimedia ICT Tools for Migrant Inclusion in host societies
Schedule:
Monday, July 6 – London (BST time zone)
Session Chair: Dr. Nicholas Vretos (Centre for Research and Technology Hellas)
Start | End | Paper | Author |
---|---|---|---|
11:00 | 11:20 | Immerse: A Personalized System Addressing The Challenges Of Migrant Integration | Dimos Ntioudis1, Eleni Kamateri1, Georgios Meditskos1, Anastasios Karakostas1, Florian Huber2, Romeo Bratska3, Stefanos Vrochidis1, Babak Akhgar4, Ioannis Kompatsiaris1 (1Information Technologies Institute - Centre for Research and Technology Hellas, 2SYNYO GmbH, 3ADITESS, 4Sheffield Hallam University) |
11:20 | 11:40 | Nadine-Bot: An Open Domain Migrant Integration Administrative Agent | Athanasios Lelis, Nicholas Vretos, Petros Daras (Information Technologies Institute, Centre for Research and Technology Hellas) |
11:40 | 12:00 | A Novel Multi-Modal Framework For Migrants Integration Based On AI Tools And Digital Companions | David Martín-Gutiérrez1, Gustavo Hernández-Peñaloza1, Theodoros Semertzidis2, Francisco Moreno1, Michalis Lazaridis2, Federico Alvarez1, Petros Daras2 (1Universidad Politecnica de Madrid, 2Centre for Research and Technology Hellas) |
12:00 | 12:10 | Coffee Break | |
12:10 | 12:30 | Embracing Novel ICT Technologies To Support The Journey From Camp To Job | Helen C. Leligou1, Despina Anastosopoulos1, Anita Montagna2, Vassilis Solachidis3, Nicholas Vretos3 (1INTRASOFT International, 2Centro Studi Pluriversum, 3Information Technologies Institute, Centre for Research and Technology Hellas) |
Organisers:
- Werner Bailer, Joanneum Research, Austria
- Klaus Schoeffmann, Klagenfurt University, Austria
- Luca Rossetto, University of Zurich, Switzerland
- Jakub Lokoč, Charles University, Czech Republic
Description:
With the recent increase in both the volume and diversity of multimedia data, effective browsing and retrieval methods have become increasingly important for dealing with the available data and finding the relevant documents. While this problem is well understood for textual documents, where an information need can often be expressed in sufficient detail with a textual query, effective search in multimedia documents is generally more difficult.
The 1st International Workshop on Interactive Multimedia Retrieval calls for submissions related to interactive retrieval in and across all types of multimedia content.
Scope and Topics:
We invite submissions reporting on current work done in the context of, e.g., the Video Browser Showdown or the Lifelog Search Challenge, as well as interactive variants of solutions to TRECVID, MediaEval or similar tasks. Submissions should describe methods, but also insights and lessons learned from participating in such benchmarks. In this context, contributions related (but not limited) to the following topics are invited:
- Interactive Retrieval Approaches and Methods
- Browsing and Interactive Search User Interfaces
- Multi-User Search, Retrieval and Browsing
- Understanding User Behaviour and Information Needs
- Cross/Multi-Modal Retrieval Methods
- Datasets, Evaluation Metrics and Protocols
- Multimedia Indexing Methods
- Video Summarization Methods
- Interactive Multimedia System Design and Architecture
Schedule:
Monday, July 6 – London (BST time zone)
Session Chair: Werner Bailer (Joanneum Research)
Start | End | Paper | Authors |
---|---|---|---|
11:00 | 11:20 | Deep Learning Classification With Noisy Labels | Guillaume Sanchez, Vincente Guis, Ricard Marxer, Frédéric Bouchara (University of Toulon) |
11:20 | 11:40 | A Text-Guided Graph Structure For Image Captioning | Depeng Wang, Zhenzhen Hu, Yuanen Zhou, Xueliang Liu, Le Wu, Richang Hong (Hefei University of Technology) |
11:40 | 12:00 | Deep Semantic Adversarial Hashing Based On Autoencoder For Large-Scale Cross-Modal Retrieval | Mingyong Li, Hongya Wang (Donghua University) |
12:00 | 12:20 | Multi-Stage Queries And Temporal Scoring In Vitrivr | Silvan Heller1, Loris Sauter1, Heiko Schuldt1, Luca Rossetto2 (1University of Basel, 2University of Zurich) |
Organisers:
- Hannes Fassold, Joanneum Research, Austria
- Dimitrios Zarpalas, Centre for Research and Technology Hellas (CERTH), Greece
- Pablo Cesar, Centrum Wiskunde & Informatica and Delft University of Technology, Netherlands
- Mario Montagud, i2CAT & University of Valencia, Spain
Description:
Extended Reality (XR), which includes Virtual Reality (VR), Augmented Reality (AR) and Mixed Reality (MR), creates entirely new ways for consumers to experience the world around them and interact with it. Within the last few years, improvements in sensor technology and processing power have led to tremendous advances in all aspects of XR hardware, and thanks to the economies of scale of the rapidly growing XR market these devices are now available at a reasonable price point. On the production side, powerful low-cost systems for capturing 3D objects, volumetric video and 360° videos make it possible to create budget VR/AR productions. The same applies to the consumption side, where VR headsets like the Oculus Go or PlayStation VR provide a highly immersive VR experience that is affordable for everyone.
Unfortunately, the development of tools and technologies for authoring, processing and delivering interactive XR experiences is lagging considerably behind the hardware development, which is definitely a hurdle for the cost-effective mass production of appealing XR content and scenarios. Lack of content in turn hinders broader adoption and acceptance of XR technologies by the consumer. For all these aspects, new approaches and technologies are needed in order to overcome the specific challenges of XR content creation (multimodal data, non-linear interactive storytelling, annotation and metadata models, novel compression techniques, bandwidth requirements, etc.).
This workshop calls for original contributions on new approaches, technologies and tools for creating, processing and delivering interactive XR media (3D/CGI content/point clouds, 360° video, 3DoF+/6DoF video, volumetric video, spatial audio…).
Scope and Topics:
Topics of particular interest include, but are not limited to:
- Efficient XR content acquisition and representation.
- Compression and delivery to various platforms (HMD, smartphones, SmartTV / HbbTV, Web, …)
- Subjective and objective assessment of XR scenarios (content quality, experiences…).
- Semantic understanding of XR content (depth estimation, semantic segmentation, object recognition, pose estimation, action recognition, audio analysis, etc.).
- Automating the XR content authoring process (e.g. providing automatic content annotation / storytelling)
- Authoring interactions and navigation aids (e.g., elements for moving in time and space, avatars)
- Authoring accessible XR experiences (e.g. subtitles, audio description, audio subtitling, sign language, …)
Keynote
Volumetric Video Content Creation for Immersive AR/VR Experiences
Speaker:
Prof. Aljosa Smolic, Trinity College Dublin
Abstract:
Volumetric video (VV) is an emerging digital medium that enables novel forms of interaction and immersion within virtual worlds. VV allows 3D representations of real-world scenes and objects to be visualized from any viewpoint or viewing direction, an interaction paradigm commonly seen in computer games. Based on this innovative media format, it is possible to design new forms of immersive and interactive experiences that can be visualized via head-mounted displays (HMDs) in virtual reality (VR) or augmented reality (AR). The talk will highlight technology for VV content creation developed by the V-SENSE lab and the startup Volograms. It will further showcase a variety of creative experiments applying VV for immersive storytelling in VR and AR.
Bio:
Prof. Smolic is the SFI Research Professor of Creative Technologies at Trinity College Dublin (TCD). Before joining TCD, Prof. Smolic was with Disney Research Zurich as Senior Research Scientist and Head of the Advanced Video Technology group, and with the Fraunhofer Heinrich-Hertz-Institut (HHI), Berlin, also heading a research group as Scientific Project Manager. At Disney Research he led over 50 R&D projects in the area of visual computing that have resulted in numerous publications and patents, as well as technology transfers to a range of Disney business units. Prof. Smolic served as Associate Editor of the IEEE Transactions on Image Processing and the Signal Processing: Image Communication journal. He was Guest Editor for the Proceedings of the IEEE, IEEE Transactions on CSVT, IEEE Signal Processing Magazine, and other scientific journals. His research group at TCD, V-SENSE, works on visual computing, combining computer vision, computer graphics and media technology to extend the dimensions of visual sensation. This includes immersive technologies such as AR, VR, volumetric video, 360/omni-directional video, light fields, and VFX/animation, with a special focus on deep learning in visual computing.
Schedule:
Monday, July 6 – London (BST time zone)
Keynote
Session Chair: Pablo Cesar (Centrum Wiskunde & Informatica and Delft University of Technology)
Start | End | Talk | Speaker |
---|---|---|---|
14:00 | 14:05 | Welcome Message from the Workshop Organizers | |
14:05 | 15:00 | Keynote: Volumetric Video Content Creation for Immersive AR/VR Experiences | Professor Aljosa Smolic (Trinity College Dublin) |
15:00 | 15:10 | Break |
Session 1
Session Chair: Mario Montagud (i2CAT & University of Valencia)
Start | End | Paper | Author |
---|---|---|---|
15:10 | 15:30 | XR360: A Toolkit For Mixed 360 And 3D Productions | Antonis Karakottas, Nikolaos Zioulis, Alexandros Doumanoglou, Vladimiros Sterzentsenko, Vasileios Gkitsas, Dimitrios Zarpalas, Petros Daras (Centre for Research and Technology Hellas) |
15:30 | 15:50 | An Authoring Model For Interactive 360 Videos | Paulo R. C. Mendes1, Alan L. V. Guedes1, Daniel de S. Moraes1, Roberto G. A. Azevedo2, Sergio Colcher1 (1Pontifical Catholic University of Rio de Janeiro, 2École Polytechnique Fédérale de Lausanne) |
15:50 | 16:10 | Towards Neural AR: Unsupervised Object Segmentation With 3D Scanned Model Through ReLaTIVE | Zackary P. T. Sin, Peter H. F. Ng, Hong Va Leong (Hong Kong Polytechnic University) |
16:10 | 16:20 | Break |
Session 2
Session Chair: Antonis Karakottas (Centre for Research & Technology Hellas)
Start | End | Paper / Talk | Author / Speaker |
---|---|---|---|
16:20 | 16:40 | Simplifying The Process Of Creating Augmented Outdoor Scenes | Ribin Chalumattu, Simone Schaub-Meyer, Robin Wiethuchter, Severin Klingler, Markus Gross (ETH Zurich) |
16:40 | 17:00 | Interactive 360 Narrative For TV Use | Christian Fuhrhop1, Louay Bassbouss1, Nico Patz2 (1Fraunhofer FOKUS, 2RBB) |
17:00 | 17:20 | Invited Talk: XR4ALL Project: "XR4ALL - Moving The European XR Tech Industry Forward" | Leen Segers (Lucidweb) |
Organisers:
- M. Shamim Hossain, King Saud University, KSA
- Stefan Goebel, KOM, TU Darmstadt, Germany
Steering Committee:
- Changsheng Xu, Multimedia Computing Group, Chinese Academy of Sciences, China (co-chair)
- Abdulmotaleb El Saddik, University of Ottawa, Ottawa, Canada (co-chair)
- Min Chen, Huazhong University of Science and Technology (HUST), China
- Mohsen Guizani, Editor-in-Chief, IEEE Network
- Athanasios Vasilakos, Lulea University of Technology, Sweden
Technical Chair:
- Susan Malaika, IBM, USA
- Md. Abdur Rahman, UPM, Saudi Arabia
Program Committee:
- Robert Istepanian, Kingston University, UK
- Zheng Chang, University of Jyväskylä, Finland
- Min Chen, Huazhong University of Science and Technology (HUST), China
- Athanasios Vasilakos, Lulea University of Technology, Sweden
- Tom Baranowski, Baylor College of Medicine, USA
- Stefan Goebel, Multimedia Communications Lab (KOM), TU Darmstadt, Germany
- Yin Zhang, Zhongnan University of Economics and Law, China
- Syed M. Rahman, University of Hawaii, USA
- Biao Song, Kyung Hee University, South Korea
- Mukaddim Pathan, Australian National University, Australia
- Gamhewage Chaminda de Silva, University of Tokyo, Japan
- Kaoru Sezaki, University of Tokyo, Japan
- Manzur Morshed, Monash University, Australia
- Edward Delp, Purdue University, USA
- Majdi Rawashdeh, New York University, UAE
- Muhammad Ghulam, CCIS, King Saud University, KSA
- Abdur Rahman, SITE, University of Ottawa, Canada
- Al-Sakib Khan Pathan, IIUM, Malaysia
- Jorge Parra, Ikerlan-IK4, Spain
- Nada Philip, Kingston University, UK
- Md. Mehedi Masud, Taif University, KSA
- Mehedi Hassan, Kyung Hee University, South Korea
- Atif Shamim, King Abdullah University Of Science & Technology, KSA
- Josef Wiemeyer, TU Darmstadt, Germany
- Lennart Nacke, University of Saskatchewan, Canada
- Anders Drachen, AGORA Informatics, Denmark
- Georgios Yannakakis, IT University of Copenhagen, Denmark
- Simon McCallum, Gjøvik University College, Hedmark, Norway
Description:
Today, multimedia services and technologies play an important role in providing and managing smart healthcare services for anyone, anywhere, at any time, seamlessly. These services and technologies give doctors and other healthcare professionals immediate access to smart-health information for efficient decision making and better treatment. Researchers are developing various multimedia tools, techniques and services to better support smart-health initiatives. In particular, work on smart-health record management, elderly health monitoring, and real-time access to medical images and video is of great interest.
Scope and Topics:
This workshop aims to report high-quality research on recent advances in various aspects of smart health, more specifically state-of-the-art approaches, methodologies and systems in the design, development, deployment and innovative use of multimedia services, tools and technologies for smart healthcare. Authors are solicited to submit complete, unpublished papers on the following topics of interest (but not limited to them):
- Edge-Cloud for Smart Healthcare
- Deep learning approach for smart healthcare
- Explainable artificial intelligence (AI) technology for secured smart healthcare
- Serious Games for health
- Multimedia big data for health care applications
- Adaptive exergames for health
- Fuzzy Logic Approach for smart healthcare monitoring
- Multimedia Enhanced Learning, Training & Simulation for Smart Health
- Sensor and RFID technologies for Smart health
- Cloud-based smart health Services
- Resource allocation for Media Cloud-assisted health care
- IoT-Cloud for Smart Healthcare
- Wearable health monitoring
- Smart health service management
- Context-aware Smart-Health services and applications
- Elderly health monitoring
- Collaborative Smart Health
- Haptics for Surgical/medical Systems
- 5G Tactile Internet for Smart Health
Keynote
Self-Explainable Artificial Intelligence, IoT, B5G, and Blockchain Based Digital Twin: Multimedia Supported Personalized Healthcare During Pandemic
Speaker:
Dr. Md. Abdur Rahman, University of Prince Muqrin
Abstract:
The COVID-19 pandemic has exposed the weaknesses of our existing healthcare systems at the city, state, country, and global level. Owing to the lack of a vaccine and the fact that the pathogen transmits from human to human, it has affected the whole world. In order to flatten the curve, healthcare providers have resorted to traditional clinical solutions, which do not scale to the mass level. Thanks to recent advancements in multimedia healthcare technologies in areas such as self-explainable Artificial Intelligence, Blockchain, IoT, and Beyond 5G, to name a few, researchers have shown that multimedia can play a key role in managing a digital twin of each individual during the pandemic. In this keynote talk, I will present 25 different domains of multimedia-supported healthcare solutions that have contributed to COVID-19 pandemic management. Finally, I will share some recommendations regarding the way forward.
Bio:
Dr. Md. Abdur Rahman is an Associate Professor and former Chairman of the Department of Cyber Security and Forensic Computing, College of Computer and Cyber Sciences, University of Prince Muqrin (UPM), Madinah Al Munawwarah, Kingdom of Saudi Arabia. Dr. Rahman is currently serving as the Director of the Research and Postgraduate Studies Department at UPM. In 2018 and 2019, Dr. Rahman received the Best Researcher Award from UPM. His research interests include Blockchain and Off-chain solutions for Multimedia Applications, Multimedia Security for Mass Crowd Applications, Self-Explainable AI for Multimedia Health Applications, Cyber Security for Cyber Physical Multimedia Systems, Secure Serious Games, Security in Cloud, Fog, and Edge, Adversarial Attacks and Defense Mechanisms of Deep Learning Systems, Secure Machine Learning for ITS, Multimedia Security for Healthcare Applications, IoT and 5G Security, Secure Smart City Services, Secure Ambient Intelligent Systems, Spatio-Temporal Multimedia Big Data Security, and Next Generation Media Security. He has authored more than 125 publications, and has one US patent granted and several pending. He has received more than 5 million USD in research grants. He is the founding director of the Smart City Research Laboratory, the Advanced Media Laboratory, and the Science and Technology Unit at UPM. Recently, he received three best paper awards from ACM and IEEE conferences. He has published in top-tier journals such as IEEE Network, IEEE Communications Magazine, IEEE Internet of Things Journal, Elsevier Future Generation Computer Systems, IEEE Access, IEEE Transactions on Instrumentation and Measurement, and IEEE Sensors, to name a few. He is an external Honorary Fellow of the CODA Research Centre, King’s College London, UK, a member of ACM, and a senior member of IEEE.
Schedule:
Monday, July 6 – London (BST time zone)
Session Chair: M. Shamim Hossain (King Saud University)
Start | End | Paper / Talk | Author / Speaker |
---|---|---|---|
14:00 | 14:05 | Opening Remarks | |
14:05 | 15:00 | Keynote: Self-Explainable Artificial Intelligence, IoT, B5G, and Blockchain Based Digital Twin: Multimedia Supported Personalized Healthcare During Pandemic | Md. Abdur Rahman (University of Prince Muqrin) |
15:00 | 15:10 | Break | |
15:10 | 15:30 | Data Driven Patient-Specialized Neural Networks For Blood Glucose Prediction | Alessandro Aliberti1, Andrea Bagatin1, Andrea Acquaviva2, Enrico Macii1, Edoardo Patti1 (1Politecnico di Torino, 2Università di Bologna) |
15:30 | 15:50 | Resource Allocation Management In Patient-To-Physician (P2P) Communications Based On Deep Reinforcement Learning In Smart Healthcare | Abduhameed Alelaiwi (King Saud University) |
15:50 | 16:10 | Architecture Of Smart Health Care System Using Artificial Intelligence | M. M. Kamruzzaman (Jouf University) |
16:10 | 16:20 | Break | |
16:20 | 16:40 | Automated Grey And White Matter Segmentation In Digitized Aβ Human Brain Tissue Slide Images | Zhengfeng Lai1, Runlin Guo1, Wenda Xu1, Zin Hu1, Kelsey Mifflin1, Brittany Dugger1, Chen-Nee Chuah1, Sen-ching Cheung2 (1University of California, 2University of Kentucky) |
16:40 | 17:00 | Multi-CNN Feature Fusion For Efficient EEG Classification | Syed Umar Amin, Ghulam Muhammad, Wadood Abdul, Mohamed Bencherif, Mansour Alsulaiman (King Saud University) |
Organisers:
- Hui Yuan, Shandong University, China
- Huanqiang Zeng, Huaqiao University, China
- Philip A. Chou, Google, USA
- Pascal Frossard, EPFL, Switzerland
Description:
The trend over the past decade towards computational imaging has enabled vast amounts of 3D data to be sensed using collections of sensors. At the same time, new types of displays have made it possible to view these 3D data in increasingly natural ways. This combination of trends is giving rise to the next generation of media beyond images, audio, and video: immersive media. Immersive media can be represented in various ways. One representation in particular – 3D point clouds – is becoming increasingly popular, in part because many of the computational imaging systems that capture immersive media are fundamentally digital systems that sample the natural world at discrete 3D points. The signals sampled at these points become attributes of the points, for example color, reflectance, transparency, normal direction, motion direction, and so forth.
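To make this representation concrete, here is a minimal sketch of a point cloud as parallel arrays of positions and per-point attributes; the field names and attribute choices are illustrative assumptions, not a standardized layout.

```python
# Minimal sketch of a point cloud as described above: a set of sampled 3D
# positions with per-point attributes. Field names are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
n = 1000

points = {
    "xyz": rng.uniform(0.0, 1.0, size=(n, 3)).astype(np.float32),  # positions
    "rgb": rng.integers(0, 256, size=(n, 3), dtype=np.uint8),      # color
    "normal": rng.normal(size=(n, 3)).astype(np.float32),          # surface normal
}
# Normalize the normals so each has unit length.
points["normal"] /= np.linalg.norm(points["normal"], axis=1, keepdims=True)

print(points["xyz"].shape, points["rgb"].dtype)  # (1000, 3) uint8
```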
The purpose of this workshop is to promote further research and understanding of 3D point clouds and their processing, analysis, compression, and communication, by providing a venue for the exchange and discussion of recent results.
Scope and Topics:
The technical issues covered by this workshop include, but are not limited to:
- Efficient compression for 3D point clouds, e.g., novel prediction technologies, transform methods, rate-distortion optimization methods, etc.,
- 3D point cloud processing based on modern signal processing theory, e.g., graph signal processing,
- 3D point cloud-based computer vision tasks, e.g., visual tracking, object detection, semantic segmentation, and recognition,
- High-reliability and low-delay transmission management optimization for 3D point cloud transmission, and
- Artificial neural network-based 3D point cloud analysis.
Schedule:
Monday, July 6 – London (BST time zone)
Session Chair: Hui Yuan (Shandong University)
Start | End | Paper | Author |
---|---|---|---|
14:00 | 14:10 | PC-PACC Workshop Opening Ceremony | |
14:10 | 14:30 | Deep Learning-Based Point Cloud Geometry Coding: RD Control Through Implicit And Explicit Quantization | Andre F. R. Guarda1, Nuno M. M. Rodrigues2, Fernando Pereira1 (1Instituto de Telecomunicações, 2Instituto Politecnico de Leiria) |
14:30 | 14:50 | Point Cloud Normal Estimation With Graph-Convolutional Neural Networks | Francesca Pistilli, Giulia Fracastoro, Diego Valsesia, Enrico Magli (Politecnico di Torino) |
14:50 | 15:10 | High-Resolution Point Cloud Reconstruction From A Single Image By Redescription | Tianshi Wang, Li Liu, Huaxiang Zhang, Jiande Sun (Shandong Normal University) |
15:10 | 15:20 | Coffee Break | |
15:20 | 15:40 | Coarse To Fine Rate Control For Region-Based 3D Point Cloud Compression | Qi Liu1, Hui Yuan1, Raouf Hamzaoui2, Honglei Su3 (1Shandong University, 2De Montfort University, 3Qingdao University) |
15:40 | 16:00 | Weighted Attribute Prediction Based On Morton Code For Point Cloud Compression | Lei Wei1, Shuai Wan1, Zexing Sun2, Xiaobin Ding1, Wei Zhang2 (1Northwestern Polytechnical University, 2Xidian University) |
Organisers:
- Frédéric Dufaux, CNRS, France
- Homer Chen, National Taiwan University, Taiwan
- Ivan V. Bajić, Simon Fraser University, Canada
- Søren Forchhammer, Technical University of Denmark, Denmark
- Xiaolin Wu, McMaster University, Canada
Technical Programme Committee:
- Anthony Vetro, MERL, USA
- Atanas Gotchev, Tampere University, Finland
- Dong Tian, InterDigital, USA
- Fernando Pereira, Instituto Superior Técnico, Portugal
- Jiaying Liu, Peking University, China
- Joachim Keinert, IIS Fraunhofer, Germany
- Mylene Farias, University of Brasília, Brazil
- Patrick le Callet, University of Nantes, France
- Peter Schelkens, VUB, Belgium
- Rafal Mantiuk, University of Cambridge, UK
- Sanghoon Lee, Yonsei University, Korea
- Søren Bech, Bang & Olufsen, Denmark
- Yonggang Wen, Nanyang Technological University, Singapore
Description:
The aim of hyper-realistic media is to faithfully represent the physical world; the ultimate goal is to create an experience that is perceptually indistinguishable from a real scene. Traditional technologies can only capture a fraction of the audio-visual information, limiting the realism of the experience. Recent innovations in computing and audio-visual technology have made it possible to circumvent these bottlenecks in audio-visual systems. As a result, new multimedia signal processing areas have emerged, such as light fields, ultra-high definition, high frame rate, high dynamic range imaging, and novel 3D audio and sound field technologies. Novel combinations of these technologies can facilitate a hyper-realistic media experience. Without a doubt, this will be the future frontier for new multimedia systems. However, several technological barriers and challenges need to be overcome to develop perceptually optimal solutions.
This first ICME workshop on Hyper-Realistic Multimedia for Enhanced Quality of Experience aims at bringing forward recent advances related to capturing, processing, and rendering technologies. The goal is to gather researchers with diverse and interdisciplinary backgrounds to cover the full multimedia signal chain, to efficiently develop truly perceptually enhanced multimedia systems.
Scope and Topics:
We seek unpublished high-quality papers within, but not limited to, the following topics:
- Lightfield, point-cloud, volumetric imaging
- High Dynamic Range imaging, Wide Color Gamut, Ultra High Definition
- Multichannel, 3D audio and sound field systems, audio rendering
- Hyper-realistic display technologies
- Human perception modeling, perceptually-inspired processing
- Processing and coding of hyper-realistic multimedia content
- Subjective and objective quality assessment
- Quality of experience
- Hyper-realism and immersiveness
- Human vision, clinical and experimental psychology and psychophysics
Keynote
Going Deep in Point Cloud Coding
Speaker:
Fernando Pereira, Instituto Superior Técnico, Universidade de Lisboa – Instituto de Telecomunicações
Abstract:
With the rising popularity of virtual and augmented reality applications, 3D visual representation formats such as point clouds (PCs) have become a hot research topic. Since a PC is essentially a set of points in 3D space with associated features, it is naturally suited to facilitating user interaction and offers a high level of immersion. However, as providing realistic, interactive and immersive experiences typically requires PCs with a rather large number of points, efficient coding is critical, as recognized by standardization groups such as MPEG and JPEG, which have been developing PC coding standards. Scalability is often a requirement in PC applications where the access time to a PC matters: a lower-quality or lower-resolution version can be obtained quickly, usually by partially decoding a bitstream structured in multiple layers. Although it may come at the cost of reduced compression efficiency, scalable PC coding is nonetheless a coding paradigm that has been relatively unexplored in the literature.
The popularity of deep learning in multimedia processing tasks has largely increased in recent years due to its impressive performance. In terms of coding, recent deep learning-based image coding solutions offer very promising results, even outperforming state-of-the-art image codecs. Part of this success may be attributed to convolutional neural networks, which take advantage of the spatial redundancy by hierarchically detecting patterns to obtain a more meaningful latent representation. In this context, it is natural to extend the deep learning-based coding approach to PCs, for example coding 3D blocks of voxels instead of 2D blocks of pixels as for image and video coding.
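As a rough sketch of that extension, the PyTorch snippet below maps a 3D block of voxel occupancies to a compact latent representation and back using 3D convolutions. The block size, layer widths, and all names are assumptions for illustration, not the codec discussed in the talk; a real learned codec would add entropy modeling and rate-distortion training.

```python
# Minimal PyTorch sketch of the idea in the paragraph above: an autoencoder
# built from 3D convolutions that maps a block of voxel occupancies to a
# compact latent and back. Layer widths and block size are assumptions; a
# real learned codec adds entropy modeling and rate-distortion training.
import torch
import torch.nn as nn

class VoxelBlockAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder: 32^3 occupancy block -> 8^3 latent with 32 channels.
        self.encoder = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        )
        # Decoder mirrors the encoder with transposed convolutions.
        self.decoder = nn.Sequential(
            nn.ConvTranspose3d(32, 16, kernel_size=4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose3d(16, 1, kernel_size=4, stride=2, padding=1),
        )

    def forward(self, block):
        latent = self.encoder(block)                 # representation to entropy-code
        return torch.sigmoid(self.decoder(latent))  # per-voxel occupancy probability

model = VoxelBlockAutoencoder()
block = (torch.rand(1, 1, 32, 32, 32) > 0.7).float()  # toy occupancy block
recon = model(block)
print(recon.shape)  # torch.Size([1, 1, 32, 32, 32])
```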
In this context, this talk will address the emerging developments in point cloud coding, notably the recent MPEG and JPEG standardization projects as well as the very recent deep learning-based coding approach, with a special focus on scalability.
Bio:
Fernando Pereira is currently with the Department of Electrical and Computer Engineering of Instituto Superior Técnico and with Instituto de Telecomunicações, Lisbon, Portugal. He is responsible for the participation of IST in many national and international research projects, and often acts as a project evaluator and auditor for various organizations. He is Area Editor of the Signal Processing: Image Communication journal and Associate Editor of the EURASIP Journal on Image and Video Processing, and is or has been a member of the Editorial Board of the Signal Processing Magazine and Associate Editor of the IEEE Transactions on Circuits and Systems for Video Technology, IEEE Transactions on Image Processing, IEEE Transactions on Multimedia, and IEEE Signal Processing Magazine. In 2013-2015, he was the Editor-in-Chief of the IEEE Journal of Selected Topics in Signal Processing. He is or has been a member of the IEEE Signal Processing Society Technical Committees on Image, Video and Multidimensional Signal Processing, and Multimedia Signal Processing, and of the IEEE Circuits and Systems Society Technical Committees on Visual Signal Processing and Communications, and Multimedia Systems and Applications. He was an IEEE Distinguished Lecturer in 2005 and was elected an IEEE Fellow in 2008 for “contributions to object-based digital video representation technologies and standards”. He was elected to serve on the Signal Processing Society Board of Governors as Member-at-Large for a 2012 term and a 2014-2016 term. Since January 2018, he is the SPS Vice-President for Conferences. Since 2013, he is also a EURASIP Fellow for “contributions to digital video representation technologies and standards”. He was elected to serve on the European Signal Processing Society Board of Directors for a 2015-2018 term.
Since 2015, he is also an IET Fellow. He is or has been a member of the Scientific and Program Committees of many international conferences and workshops. He was the General Chair of the Picture Coding Symposium (PCS) in 2007, the Technical Program Co-Chair of the International Conference on Image Processing (ICIP) in 2010 and 2016, the Technical Program Chair of the International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS) in 2008 and 2012, and the General Chair of the International Conference on Quality of Multimedia Experience (QoMEX) in 2016. He has been participating in the MPEG standardization activities, notably as the head of the Portuguese delegation, chairman of the MPEG Requirements Group, and chairman of many Ad Hoc Groups related to the MPEG-4 and MPEG-7 standards. Since February 2016, he is the JPEG Requirements Chair. He has been one of the key designers of the JPEG Pleno project, which targets defining standard representations for several types of plenoptic imaging, notably light fields, point clouds and holograms. He has been developing research on point cloud clustering, coding and quality assessment, and publishing in these areas. He has contributed more than 250 papers to international journals, conferences and workshops, and has given several tens of invited talks at conferences and workshops. His areas of interest are video analysis, coding, description and adaptation, and advanced multimedia services.
Schedule:
Friday, July 10 – London (BST time zone)
Keynote
Session Chairs: Søren Forchhammer (Technical University of Denmark) and Frédéric Dufaux (CNRS)
Start | End | Talk | Speaker |
---|---|---|---|
09:00 | 10:00 | Going Deep in Point Cloud Coding | Fernando Pereira (Instituto Superior Técnico, Universidade de Lisboa - Instituto de Telecomunicações) |
10:00 | 10:10 | Break |
Session 1: Multimedia systems
Session Chair: Homer Chen (National Taiwan University)
Start | End | Paper | Author |
---|---|---|---|
10:10 | 10:30 | Time Varying Quality Estimation For HTTP Based Adaptive Video Streaming | Chaminda T E R Hewage1, Maria Martini2 (1Cardiff Metropolitan University, 2Kingston University) |
10:30 | 10:50 | Towards View-Aware Adaptive Streaming Of Holographic Content | Hadi Amirpourazarian1, Christian Timmerer1, Mohammad Ghanbari2 (1Alpen-Adria-Universität Klagenfurt, 2University of Essex) |
10:50 | 11:10 | Creation Of A Hyper-Realistic Remote Music Session With Professional Musicians And Public Audiences Using 5G Commodity Hardware | Alexander Carôt1, Fragkiskos Sardis2, Mischa Dohler2, Simon Saunders2, Navdeep Uniyal3, Richard Cornock4 (1Anhalt University of Applied Sciences, 2King's College London, 3University of Bristol, 4Royal Birmingham Conservatoire) |
11:10 | 11:20 | Break |
Session 2: View interpolation/synthesis
Session Chair: Søren Forchhammer (Technical University of Denmark)
Start | End | Paper | Author |
---|---|---|---|
11:20 | 11:40 | A Benchmark Of Light Field View Interpolation Methods | Dingcheng Yue1, Muhammad Shahzeb Khan Gul2, Michel Bätz2, Joachim Keinert2, Rafal Mantiuk1 (1University of Cambridge, 2Fraunhofer IIS) |
11:40 | 12:00 | Pipeline For Real-Time Video View Synthesis | Benoit Vandame, Neus Sabater, Guillaume Boisson, Didier Doyen, Valérie Allié, Frederic Babon, Remy Gendrot, Tristan Langlois, Arno Schubert (InterDigital) |
12:00 | 12:20 | Learning Illumination From A Limited Field-Of-View Image | Yuke Sun, Dan Li, Shuang Liu, Tianchi Cao, Ying-Song Hu (Huazhong University of Science and Technology) |
12:20 | 14:00 | Lunch Break |
Session 3: Quality assessment and perceptual studies
Session Chair: Frédéric Dufaux (CNRS)
Start | End | Paper | Author |
---|---|---|---|
14:00 | 14:20 | No-Reference Quality Evaluation Of Light Field Content Based On Structural Representation Of Epipolar Plane Image | Ali Ak, Suiyi Ling, Patrick Le Callet (University of Nantes) |
14:20 | 14:40 | Towards A Point Cloud Structural Similarity Metric | Evangelos Alexiou, Touradj Ebrahimi (EPFL) |
14:40 | 15:00 | Field-Of-View Effect On The Perceived Quality Of Omnidirectional Images | Falah Jabar, Joao Ascenso, Maria Paula Queluz (Instituto Superior Técnico, Universidade de Lisboa) |
15:00 | 15:10 | Break | |
15:10 | 15:30 | The Impact Of Screen Resolution Of HMD On Perceptual Quality Of Immersive Videos | Wenjie Zou1, Lihui Yang1, Fuzheng Yang1, Zhibin Ma2, Qiyong Zhao2 (1Xidian University, 2Huawei Technologies Co., Ltd.) |
15:30 | 15:50 | Audio-Visual Perception Of Omnidirectional Video For Virtual Reality Applications | Fang-Yi Chao1, Cagri Ozcinar2, Chen Wang2, Emin Zerman2, Lu Zhang1, Wassim Hamidouche1, Olivier Deforges1, Aljosa Smolic2 (1INSA Rennes, 2Trinity College Dublin) |
15:50 | 16:00 | Break |
Session 4: High Dynamic Range
Session Chair: Ivan V. Bajić (Simon Fraser University)
Start | End | Paper | Author |
---|---|---|---|
16:00 | 16:20 | HMM-Based Framework To Measure The Visual Fidelity Of Tone Mapping Operators | Waqas Ellahi, Toinon Vigier, Patrick Le Callet (University of Nantes) |
16:20 | 16:40 | Tone Mapping Operators: Progressing Towards Semantic-Awareness | Abhishek Goswami1, Mathis Petrovich1, Wolf Hauser1, Frederic Dufaux2 (1DxO Labs, 2CNRS) |
16:40 | 17:00 | A High-Resolution High Dynamic Range Light-Field Dataset With An Application To View Synthesis And Tone-Mapping | Muhammad Shahzeb Khan Gul, Thorsten Wolf, Michel Bätz, Matthias Ziegler, Joachim Keinert (Fraunhofer IIS) |
17:00 | 17:10 | Break |
Session 5: Processing and display
Session Chair: Xiaolin Wu (McMaster University)
Start | End | Paper | Author |
---|---|---|---|
17:10 | 17:30 | Computational Multifocal Near-Eye Display With Hybrid Refractive-Diffractive Optics | Ugur Akpinar, Erdem Sahin, Atanas Gotchev (Tampere University) |
17:30 | 17:50 | View Synthesis-Based Distributed Light Field Compression | Muhammad Umair Mukati1, Milan Stepanov2, Giuseppe Valenzise3, Frederic Dufaux3, Soren Forchhammer2 (1Technical University of Denmark, 2CentraleSupelec, 3CNRS) |
17:50 | 18:10 | Depth Of Field Image Sequences: 3D Cuing Of High Efficiency | Fangzhou Luo1, Xiao Shu1,2, Xiaolin Wu1 (1McMaster University, 2Shanghai Jiao Tong University) |
Organisers:
- Tian Gan, Shandong University, China
- Wen-Huang Cheng, National Chiao Tung University, Taiwan
- Kai-Lung Hua, National Taiwan University of Science and Technology, Taiwan
- Vladan Velisavljevic, University of Bedfordshire, UK
Description:
The intimate presence of mobile devices in our daily life, such as smartphones and wearable gadgets like smart watches, has dramatically changed the way we connect with the world around us. Nowadays, in the era of the Internet of Things (IoT), these devices are further extended by smart sensors and actuators that augment multimedia devices with additional data and possibilities. With a growing number of powerful embedded mobile sensors such as cameras, microphones, GPS, gyroscopes, accelerometers, digital compasses, and proximity sensors, a wide variety of data is available, enabling new sensing applications across diverse research domains including mobile media analysis, mobile information retrieval, mobile computer vision, mobile social networks, mobile human-computer interaction, mobile entertainment, mobile gaming, mobile healthcare, mobile learning, and mobile advertising. The workshop on Mobile Multimedia Computing (MMC 2020) therefore aims to bring together researchers and professionals from worldwide academia and industry to showcase, discuss, and review the whole spectrum of technological opportunities, challenges, solutions, and emerging applications in mobile multimedia.
Scope and Topics:
Topics of interest include but are not limited to:
- Ubiquitous computing on mobile and wearable devices
- Mobile visual search
- Action/gesture/object/speech recognition with mobile sensor
- Multimedia data in the IoT
- Computational photography on mobile devices
- Mobile social signal processing
- Human computer interaction with mobile and wearable devices
- Mobile virtual and augmented reality
- Mobile multimedia content adaptation and adaptive streaming
- Mobile multimedia indexing and retrieval
- Power saving issues of mobile multimedia computing
- Multi-modal and multi-user mobile sensing
- Personalization, privacy and security in mobile multimedia
- 2D/3D computer vision on mobile devices
- User behavior analysis of mobile multimedia applications
- Multimedia Cloud Computing
- Other topics related to mobile multimedia computing
Awards:
The MMC Best Paper Award will be granted to the best overall paper. The selection is based on the quality, originality, and clarity of the submission.
Keynote
Application of Machine Learning in Smart Baby Monitor
Speaker:
Prof. Chuan-Yu Chang, National Yunlin University of Science and Technology, Service Systems Technology Center, Industrial Technology Research Institute (ITRI)
Abstract:
Crying is an infant’s first communication. Before learning how to express emotions or physiological/psychological needs with language, infants usually express how they feel to their parents through crying. According to pediatricians’ reports, normal newborns cry two hours a day. However, it is sometimes difficult for parents to figure out why the baby cries. In 2014, we cooperated with the National Taiwan University Hospital Yunlin Branch to develop the “Infant Crying Translator”, which identifies whether babies are hungry, have a wet diaper, want to sleep, or are in pain. To provide more comprehensive newborn care services, we are also developing a comprehensive intelligent baby monitor based on incremental learning.
In this talk, I will introduce the functional development of the smart baby monitor, including crying detection, crying analysis, vomiting detection, and facial heart rate and breathing detection. We propose a new deep learning network for cry recognition that breaks through the limitations of traditional machine learning methods, and introduce an incremental learning mechanism to shorten the adaptation of individual crying models. Features such as infant vomiting detection and facial heart rate and breathing detection technology have also been developed to improve the monitoring system for newborns, make it easier for novice parents to care for their babies, and reduce accidents.
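One way to picture the incremental-adaptation idea mentioned above is to freeze a pretrained cry-recognition backbone and fine-tune only a small classifier head on a few recordings from a new infant. This is a hedged sketch, not Prof. Chang's actual system; the backbone, head, class set, and data shapes are all placeholder assumptions.

```python
# Rough sketch of per-infant incremental adaptation as described above:
# freeze a pretrained cry-recognition backbone and fine-tune only a small
# classifier head on a few labeled cries from a new baby. All components
# here are hypothetical placeholders, not the system from the talk.
import torch
import torch.nn as nn

backbone = nn.Sequential(  # stands in for a pretrained audio feature extractor
    nn.Flatten(), nn.Linear(64 * 40, 128), nn.ReLU(),
)
head = nn.Linear(128, 4)  # 4 toy classes: hungry, wet diaper, sleepy, pain

for p in backbone.parameters():
    p.requires_grad = False  # keep the shared model fixed

optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# A handful of (spectrogram, label) examples from the new infant.
x = torch.randn(8, 64, 40)   # toy log-mel spectrograms
y = torch.randint(0, 4, (8,))

for _ in range(20):  # a few quick adaptation steps
    optimizer.zero_grad()
    loss = loss_fn(head(backbone(x)), y)
    loss.backward()
    optimizer.step()
print(float(loss))
```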
Bio:
Chuan-Yu Chang received the Ph.D. degree in electrical engineering from National Cheng Kung University, Taiwan, in 2000. He is currently the CTO (Chief Technology Officer) of the Service Systems Technology Center, Industrial Technology Research Institute (ITRI), Taiwan. He is also a Distinguished Professor at the Department of Computer Science and Information Engineering, National Yunlin University of Science and Technology (YunTech), Taiwan. He was the Chair of the Department of Computer Science and Information Engineering at YunTech from 2009 to 2011. From 2011 to 2019, he served as Dean of Research and Development and Director of the Incubation Center for Academia-Industry Collaboration and Intellectual Property at YunTech. His current research interests include computational intelligence and its applications to medical image processing, automated optical inspection, emotion recognition, and pattern recognition. In the above areas, he has more than 200 publications in journals and conference proceedings.
He served as the Program co-Chair of TAAI 2007, CVGIP 2009, 2010-2015 International Workshop on Intelligent Sensors and Smart Environments, and the third International Conference on Robot, Vision and Signal Processing (RVSP 2015). He served as General co-chair of 2012 International Conference on Information Security and Intelligent Control, 2011-2013 Workshop on Digital Life Technologies, CVGIP2017, WIC2018, ICS2018, and Finance co-chair of 2013 International Conference on Information, Communications and Signal Processing.
He serves as an Associate Editor for two international journals: Multidimensional Systems and Signal Processing and the International Journal of Control Theory and Applications. He is an IET Fellow, a Life Member of IPPR and TAAI, and a Senior Member of IEEE. From 2015 to 2017, he was the chair of the IEEE Signal Processing Society Tainan Chapter and the Representative for Region 10 of the IEEE SPS Chapters Committee. He is currently the President of the Taiwan Association for Web Intelligence Consortium.
Schedule:
Friday, July 10 – London (BST time zone)
Session 1
Session Chair: Wen-Huang Cheng (National Chiao Tung University)
Start | End | Talk / Paper | Speaker / Author |
---|---|---|---|
11:00 | 12:00 | Keynote: Application of Machine Learning in Smart Baby Monitor | Prof. Chuan-Yu Chang (National Yunlin University of Science and Technology, Service Systems Technology Center, Industrial Technology Research Institute (ITRI)) |
12:00 | 12:10 | Break | |
12:10 | 12:30 | A Highly Efficient And Robust Method For NNF-Based Template Matching | Yuhai Lan1, Xingchun Xiang2, Huaixuan Zhang2, Shuhan Qi3 (1Huawei Technologies Co. Ltd, 2Tsinghua University, 3Harbin Institute of Technology) |
12:30 | 12:50 | TwinVO: Unsupervised Learning Of Monocular Visual Odometry Using Bi-Direction Twin Network | Xing Cai1, Lanqing Zhang1, Chengyuan Li1, Ge Li1, Thomas H Li2 (1Peking University Shenzhen Graduate School, 2Peking University) |
12:50 | 14:00 | Break |
Session 2
Session Chair: Kai-Lung Hua (National Taiwan University of Science and Technology)
Start | End | Paper | Author |
---|---|---|---|
14:00 | 14:20 | Fractional Step Discriminant Pruning: A Filter Pruning Framework For Deep Convolutional Neural Networks | Nikolaos Gkalelis, Vasileios Mezaris (Information Technologies Institute, Centre for Research and Technology Hellas) |
14:20 | 14:40 | Bayesian Learning For Neural Network Compression | Jen-Tzung Chien, Su-Ting Chang (National Chiao Tung University) |
14:40 | 15:00 | A Highly Efficient Training-Aware Deep Network Compression Paradigm | Chang Liu, Hongtao Lu (Shanghai Jiao Tong University) |
15:00 | 15:10 | Break |
Session 3
Session Chair: Vladan Velisavljevic (University of Bedfordshire)
Start | End | Paper | Author |
---|---|---|---|
15:10 | 15:30 | Deep Restoration Of Invisible QR Code From TPVM Display | Kaihua Song1, Ning Liu1, Zhongpai Gao1, Jiahe Zhang1, Guangtao Zhai1, Xiao-Ping Zhang2 (1Shanghai Jiao Tong University, 2Ryerson University) |
15:30 | 15:50 | A Learning-Based Low-Complexity In-Loop Filter For Video Coding | Chao Liu1, Heming Sun2, Jiro Katto2, Xiaoyang Zeng1, Yibo Fan1 (1Fudan University, 2Waseda University) |
15:50 | 16:10 | Fine-Grained Image Classification With Coarse And Fine Labels On One-Shot Learning | Qihan Jiao1, Zhi Liu1, Gongyang Li1, Linwei Ye2, Yang Wang2 (1Shanghai University, 2University of Manitoba) |
16:10 | 16:20 | Break |
Session 4 and Best Paper Award Session
Session Chair: Tian Gan (Shandong University)
Start | End | Paper | Author |
---|---|---|---|
16:20 | 16:40 | LRNNET: A Light-Weighted Network With Efficient Reduced Non-Local Operation For Real-Time Semantic Segmentation | Weihao Jiang, Zhaozhi Xie, Yaoyi Li, Chang Liu, Hongtao Lu (Shanghai Jiao Tong University) |
16:40 | 16:50 | Best Paper Award Announcement |
Organisers:
- Prof. Huang-Chia Shih, Yuan Ze University, Taiwan
- Prof. Rainer Lienhart, Augsburg University, Germany
- Prof. Takahiro Ogawa, Hokkaido University, Japan
- Prof. Jenq-Neng Hwang, University of Washington, USA
Description:
Sports data holds enormous potential for revolutionizing the sports industry. Coaches and teams are constantly searching for competitive sports data analytics that utilize AI and computer vision techniques to understand the deeper, hidden semantics of sports. By learning detailed statistics, coaches can assess defensive athletic performance and develop improved strategies. Sports data analytics is the process of analysing spatiotemporal content and sensor data from sports matches in online and offline scenarios. Machine learning is already widely used in the sports industry, and many approaches have been proposed to extract semantic concepts or abstract attributes, such as objects, events, scene types, and captions, from sports videos. However, a limitation of conventional sports data analytics is that a domain-specific model can only be applied to analyse a single sport.
The goal of this workshop is to advance the field of research on the techniques of AI for sports data, develop more techniques to accurately evaluate and organize the data, and further strengthen the synergy between sports and science. Papers about machine learning, vision processing, and data sciences in sports and new forms of sports technologies are encouraged for submission.
Scope and Topics:
Topics of interest include, but are not limited to:
- Object detection/modelling/recognition in sports data
- Athlete motion capture with learning algorithms in sports
- Activities/actions recognition in sports data
- 3D Sports and AR/VR
- Artificial Intelligence strategy for sports
- Tracking and trajectory analysis with learning algorithms in sports
- Semantic analysis in sports data
- Tactics analysis for sports
- Athletes’ decision-making
- Supervised/unsupervised/reinforcement learning for sports data
- Efficient learning algorithm for sports data compression
- Energy- and resource-efficient machine learning architectures for large-scale sports data analytics
- Sports video content analysis in the media cloud
- Performance assessment in sports
- Emerging applications of deep learning in sports content search, retrieval, recommendation, understanding, and summarization
- Future trends and challenges for sports data analytics
- New learning theories and models for sports data analysis and understanding
- Other learning techniques from examples such as imitation learning and emerging cognition system in sports
- New sports database and metrics to evaluate the benefit of sports analytics system
- Survey papers regarding the topic of sports data analytics
Sponsors:
Ministry of Science and Technology Taiwan
Chinese Image Processing Pattern Recognition Society
Institute of Information & Computing Machinery, Taiwan.
Schedule:
Friday, July 10 – London (BST time zone)
Session Chair: Huang-Chia Shih (Yuan Ze University)
Start | End | Paper | Authors |
---|---|---|---|
11:00 | 11:05 | Opening | |
11:05 | 11:25 | Efficient Fitness Action Analysis Based on Spatio-temporal Feature Encoding | Jianwei Li1, Hainan Cui2, Tianxiao Guo1, Qingrui Hu1, Yanfei Shen1 (1Beijing Sport University, China, 2National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences) |
11:25 | 11:45 | Automatic Key Moment Extraction and Highlights Generation based on Comprehensive Soccer Video Understanding | Xin Gao, Xusheng Liu, Taotao Yang, Guilin Deng, Hao Peng, Qiaosong Zhang, Hai Li, Junhui Liu (iQIYI Inc, China) |
11:45 | 12:05 | Robust Estimation of Flight Parameters for Ski Jumpers | Katja Ludwig, Moritz Einfalt, Rainer Lienhart (University of Augsburg, Germany) |
12:05 | 12:15 | Coffee break | |
12:15 | 12:35 | MVGAN: Maximizing Time-lag Aware Canonical Correlation for Baseball Highlight Generation | Kaito Hirasawa, Keisuke Maeda, Takahiro Ogawa, Miki Haseyama (Hokkaido University, Japan) |
12:35 | 12:40 | Closing remarks |
Organisers:
- John See, Multimedia University, Malaysia
- Weiyao Lin, Shanghai Jiao Tong University, China
- Xiatian Zhu, Samsung AI Centre, UK
Description:
With the rapid growth of video surveillance applications and services, the amount of surveillance video has become extremely “big”, which makes human monitoring tedious and difficult. There is therefore a huge demand for smart surveillance techniques that can perform monitoring automatically or semi-automatically. A number of challenges have arisen in the area of big surveillance data analysis and processing. Firstly, with the huge amount of surveillance video in storage, video analysis tasks such as event detection, action recognition, and video summarization are of increasing importance in applications including events-of-interest retrieval and abnormality detection. Secondly, semantic data (e.g., object trajectories and bounding boxes) has become an essential data type in surveillance systems owing to its growing size and complexity, introducing new challenging topics to the community, such as efficient semantic data processing and compression. Thirdly, with the rapid shift from static, centralized processing to dynamic computing among distributed video processing nodes/cameras, new challenges such as multi-camera analysis, person re-identification, and distributed video processing confront us. To meet these challenges, there is a great need to extend existing approaches or explore new feasible techniques.
Scope and Topics:
Topics of interest include, but are not limited to:
- Event detection, action recognition, and activity analysis in surveillance videos
- Multi-camera analysis and recognition
- Object detection and tracking in surveillance videos
- Recognition and parsing of crowded scenes
- Person or group re-identification
- Summarization and synopsis on surveillance videos
- Surveillance scene parsing, segmentation, and analysis
- Semantic data processing in large-scale surveillance systems
- Data compression in surveillance systems
- Robust face recognition and detection under low-resolution surveillance videos
- Restoration and enhancement of degradations in low-quality surveillance videos
Keynote
Weak Person Re-identification
Speaker:
Prof. Wei-Shi Zheng, Sun Yat-sen University
Abstract:
Person re-identification (re-id) is an important research topic concerned with associating persons across non-overlapping camera views in visual surveillance. While person re-id has developed rapidly over the last ten years, it still suffers from serious unresolved influences, such as illumination and clothing changes. In addition, the performance of most current re-id algorithms depends heavily on the annotation of massive data, and how to handle large amounts of weakly annotated data, or to perform re-id with no annotated data at all, remains an urgent challenge. In this talk, we will introduce research on weak person re-identification, including weakly supervised solutions for re-id with weak labels and new models for re-id with weak visual cues.
Bio:
Dr. Wei-Shi Zheng received his PhD degree in Applied Mathematics from Sun Yat-sen University in 2008 and is now a full Professor at Sun Yat-sen University. He has published more than 120 papers, including more than 100 publications in main journals (including 12 TPAMI/IJCV papers) and top conferences (ICCV, CVPR, IJCAI, AAAI). He has co-organized four tutorial presentations, at ACCV 2012, ICPR 2012, ICCV 2013, and CVPR 2015. His research interests include person/object association and activity understanding in visual surveillance, and the related large-scale machine learning algorithms. In particular, Dr. Zheng has been actively researching person re-identification over the last five years. He serves extensively for many journals and conferences and was recognized for outstanding reviewing at recent top conferences (ECCV 2016 & CVPR 2017). He has participated in the Microsoft Research Asia Young Faculty Visiting Programme, and he has served as a senior PC member/area chair/associate editor for CVPR, AAAI, IJCAI, and BMVC. He is an IEEE MSA TC member and an associate editor of Pattern Recognition. He is a recipient of the Excellent Young Scientists Fund of the National Natural Science Foundation of China and of a Royal Society-Newton Advanced Fellowship of the United Kingdom.
Keynote
Addition, Subtraction and Geometry in Person Re-identification
Speaker:
Dr. Yang Hua, Queen's University Belfast
Abstract:
In surveillance, person re-identification (re-id) has emerged as a fundamental capability that no tracking system aiming to operate over a wide-area network of disjoint cameras can do without. Person re-identification is a challenging task because of the many sources of appearance variability, such as lighting, pose, viewpoint, and occlusion, especially in outdoor environments, where they are even less constrained.
In this talk, based on real-world scenarios, we address person re-id problems from the following straightforward but effective intuitions:
- Addition. Video-based person re-id deals with the inherent difficulty of matching unregulated sequences with different lengths and with incomplete target pose/viewpoint structure. To this end, we propose a novel approach that can exploit the rich video information more effectively by **addition**. Specifically, we complement the original pose-incomplete information carried by the sequences with synthetic GAN-generated images, and fuse their feature vectors into a more discriminative, viewpoint-insensitive embedding.
- Subtraction. In contrast, video-based person re-id suffers from low-quality frames caused by severe motion blur, occlusion, distractors, etc. To address this, we introduce Class-Aware Attention (CAA) in deep metric learning, which **subtracts** abnormal and trivial samples from video sequences.
- Geometry. The viewpoint variability across a network of non-overlapping cameras is a challenging problem affecting person re-id performance. We investigate how to mitigate the cross-view ambiguity by learning highly discriminative deep features with **geometric** information. The proposed objective is made up of two terms, the Steering Meta Center (SMC) term and the Enhancing Centers Dispersion (ECD) term, which steer the training process toward mining effective intra-class and inter-class relationships in the feature domain of the identities (a generic, illustrative sketch of this style of objective appears below).
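The SMC and ECD terms are not defined in this abstract; as a hedged illustration only, a generic center-based objective combining an intra-class pulling term with an inter-class center-dispersion term might be sketched as follows (the function name, margin parameter, and all details are our own assumptions, not the speaker's formulation):

```python
# Illustrative sketch only (PyTorch): a generic center-based metric-learning
# objective with an intra-class compactness term and an inter-class
# center-dispersion term. This is NOT the SMC/ECD objective from the talk.
import torch

def center_dispersion_loss(features, labels, centers, margin=1.0):
    """features: (N, D) embeddings; labels: (N,) class ids; centers: (C, D)."""
    # Intra-class term: pull each embedding toward its own class center.
    intra = ((features - centers[labels]) ** 2).sum(dim=1).mean()
    # Inter-class term: push distinct class centers at least `margin` apart.
    dists = torch.cdist(centers, centers)                   # (C, C) pairwise distances
    off_diag = ~torch.eye(len(centers), dtype=torch.bool)   # mask out self-distances
    inter = torch.relu(margin - dists[off_diag]).mean()
    return intra + inter

# Toy usage: 8 samples, 4 identities, 16-dimensional embeddings.
feats = torch.randn(8, 16, requires_grad=True)
labels = torch.randint(0, 4, (8,))
centers = torch.randn(4, 16, requires_grad=True)
loss = center_dispersion_loss(feats, labels, centers)
loss.backward()
```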
Bio:
Dr. Yang Hua is presently a Lecturer (equivalent to Assistant Professor) at EEECS/ECIT, Queen's University Belfast, UK. He received his Ph.D. degree from Université Grenoble Alpes / Inria Grenoble Rhône-Alpes, France, funded by the Microsoft Research – Inria Joint Center. Before that, he worked as a senior R&D engineer at Panasonic Singapore Laboratories in Singapore for four years. He holds three US patents and one China patent and has won four titles in prestigious international competitions, including the PASCAL Visual Object Classes (VOC) Challenge Classification Competition and the Thermal Imagery Visual Object Tracking (VOT-TIR) Competition.
Schedule:
Friday, July 10 – London (BST time zone)
Keynote 1
Session Chair: Xiatian Zhu (Samsung AI Centre)
Start | End | Talk | Speaker |
---|---|---|---|
14:00 | 14:10 | Opening remarks | |
14:10 | 14:50 | Keynote: Weak Person Re-identification | Dr. Wei-Shi Zheng (Sun Yat-sen University, China) |
Session 1: Object Detection and Manipulation in Big Surveillance
Session Chair: John See (Multimedia University) and Weiyao Lin (Shanghai Jiao Tong University)
Start | End | Paper | Author |
---|---|---|---|
14:50 | 15:10 | A Regional Regression Network For Monocular Object Distance Estimation | Yufeng Zhang, Yuxi Li, Mingbi Zhao, Xiaoyuan Yu (Shanghai Jiao Tong University) |
15:10 | 15:30 | OD-GCN: Object Detection Boosted By Knowledge GCN | Zheng Liu1, Zidong Jiang1, Wei Feng1, Hui Feng2 (1iQIYI Inc, 2Fudan University) |
15:30 | 15:50 | Comparing CNN-Based Object Detectors On Two Novel Maritime Datasets | Valentine Soloviev1, Fahimeh Farahnakian2, Luca Zelioli1, Bogdan Iancu1, Johan Lilius1, Jukka Heikkonen2 (1Åbo Akademi University, 2University of Turku) |
15:50 | 16:10 | Semi-Blind Super-Resolution With Kernel-Guided Feature Modification | Gongping Li, Yao Lu, Lihua Lu, Ziwei Wu, Xuebo Wang, Shunzhou Wang (Beijing Institute of Technology) |
16:10 | 16:20 | Break |
Keynote 2
Session Chair: Xiatian Zhu (Samsung AI Centre)
Start | End | Talk | Speaker |
---|---|---|---|
16:20 | 17:00 | Keynote: Addition, Subtraction and Geometry in Person Re-identification | Dr. Yang Hua (Queen's University Belfast, UK) |
Session 2: Tracking and Event Processing in Big Surveillance
Session Chair: John See (Multimedia University) and Weiyao Lin (Shanghai Jiao Tong University)
Start | End | Paper | Author |
---|---|---|---|
17:00 | 17:20 | Adaptive Depth Network For Crowd Counting And Beyond | Liangzi Rong, Chunping Li (Tsinghua University) |
17:20 | 17:40 | Abnormal Event Detection In Surveillance Videos Using Two-Stream Decoder | Herman Prawiro1, Jian-Wei Peng2, Tse-Yu Pan1, Min-Chun Hu1 (1National Tsing Hua University, 2National Cheng Kung University) |
17:40 | 18:00 | Effect Of Video Transcoding Parameters On Visual Object Tracking For Surveillance Systems | Taieb Chachou1, Sid Ahmed Fezza2, Ghalem Belalem1, Wassim Hamidouche3 (1Université d'Oran, 2National Institute of Telecommunications and ICT, 3Université de Rennes) |
18:00 | 18:10 | Closing Remarks |
Organisers:
- Prof. Yun Zhang, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, China
- Prof. Raouf Hamzaoui, De Montfort University, UK
- Prof. C.-C. Jay Kuo, University of Southern California, USA
- Prof. Dietmar Saupe, University of Konstanz, Germany
Sponsor:
- SFB-TRR 161
Description:
The Picture-wise Just Noticeable Difference (PJND) for a given subject, image/video, and compression scheme is the smallest distortion that the subject can perceive when the image/video is compressed with this compression scheme. The PJND is normally determined through subjective quality assessment tests over a large population of viewers. Knowing the PJND statistics makes it possible to reduce the bitrate without perceptual quality loss for a chosen proportion of the population. The workshop seeks papers proposing novel techniques to determine or predict PJND statistics, as well as papers using these statistics for image/video processing, compression, and communication. While the focus of the workshop is on the PJND concept, contributions to the conventional JND approach, where a JND threshold is computed at the pixel or subband level, are also welcome provided the work is data driven.
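As a minimal sketch of the concept (our own illustration, assuming a hypothetical `subject_notices(image, qp)` oracle and visibility that is monotone in the quantization parameter), a subject's PJND can be located by binary search, and the PJND statistics of a population yield a satisfied user ratio (SUR) for any target quality level:

```python
# Minimal sketch, assuming a hypothetical `subject_notices(image, qp)` oracle that
# returns True when the subject can distinguish the compressed image from the
# original, and assuming visibility is monotone in the quantization parameter (QP).
def find_pjnd_qp(image, subject_notices, qp_min=0, qp_max=51):
    """Binary-search the smallest QP at which the subject notices a difference."""
    lo, hi = qp_min, qp_max
    while lo < hi:
        mid = (lo + hi) // 2
        if subject_notices(image, mid):
            hi = mid          # difference visible: the PJND is at or below mid
        else:
            lo = mid + 1      # difference invisible: search higher distortion
    return lo

def satisfied_user_ratio(pjnd_qps, qp):
    """Fraction of subjects whose PJND lies above `qp`, i.e. who see no difference."""
    return sum(1 for p in pjnd_qps if p > qp) / len(pjnd_qps)

# Toy example: with per-subject PJND QPs of [30, 32, 35, 38, 40], encoding at
# QP 31 satisfies 4 of the 5 viewers (SUR = 0.8).
print(satisfied_user_ratio([30, 32, 35, 38, 40], 31))
```

Encoding at the largest QP whose SUR still meets the chosen proportion of the population gives the bitrate saving without perceptual quality loss mentioned above.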
Scope and Topics:
Topics of interest include, but are not limited to:
- PJND/JND datasets for images, video, 3D video, omni-directional images/video, and point clouds
- PJND/JND visual attributes related to contents, displays, the environment and the human visual system
- Data-driven computational models for PJND/JND
- Machine learning techniques to estimate the PJND/JND
- Evaluation methods and metrics for JND/PJND models
- PJND/JND concept, visual attributes, perception and prediction models
- Data-driven PJND/JND models and their application to visual perception
- PJND/JND models and their application to multimedia signal processing, compression and communication
Invited Talk
Visual Perception and JND Modelling: Progress & Challenges
Speaker:
Weisi Lin, Nanyang Technological University, Singapore
Abstract:
Just-noticeable differences (JNDs), as perceptual thresholds of visibility, determine the minimal amount of change required for a difference to be sensed by human beings (e.g., by 75% of a population), and they play an important role, both explicitly and implicitly, in many applications. The measurement, formulation, and computational modelling of JND are prerequisites for user-centric designs that turn human perceptual limitations into meaningful system advantages. In this talk, a holistic view will be presented on visual JND research and practice: absolute and utility-oriented JNDs; pixel-, subband-, and picture-based JNDs; conventional and data-driven JND estimation; and databases and model evaluation. Other factors influencing JND, including culture and personality, will also be highlighted. JND modelling for visual signals (naturally captured, computer-generated, or mixed) has attracted much research interest so far, while modelling for audio, haptics, olfaction, and gustation is expected to attract increasing interest on the way toward true multimedia. Possible new directions will then be discussed to advance the relevant research.
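In our own notation (not taken from the talk), the population-level threshold mentioned above can be written as:

```latex
% JND at the 75% detection level for a stimulus x: the smallest distortion
% magnitude that at least 75% of the population can detect, where D(x, \Delta)
% denotes x distorted by magnitude \Delta.
\mathrm{JND}_{0.75}(x) = \min \Bigl\{ \Delta > 0 :
    \Pr_{s \sim \mathrm{population}} \bigl[ s \text{ detects } D(x, \Delta) \bigr] \ge 0.75 \Bigr\}
```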
Bio:
Weisi Lin's research covers image processing, perception-based signal modelling and assessment, video compression, and multimedia communication systems. In these areas, he has published 240+ international journal papers and 260+ international conference papers, 9 patents, 9 book chapters, 2 authored books, and 3 edited books, and he has an excellent track record of leading and delivering more than 10 major funded projects (with over S$7m in research funding). He earned his BSc and MSc from Sun Yat-sen University, China, and his PhD from King's College, University of London. He was previously the Lab Head of Visual Processing at the Institute for Infocomm Research (I2R). He is a Professor in the School of Computer Science and Engineering, Nanyang Technological University, where he served as the Associate Chair (Graduate Studies) in 2013-2014.
He is a Fellow of the IEEE and the IET, and an Honorary Fellow of the Singapore Institute of Engineering Technologists. He was named a Highly Cited Researcher 2019 by the Web of Science, was elected a Distinguished Lecturer of both the IEEE Circuits and Systems Society (2016-17) and the Asia-Pacific Signal and Information Processing Association (2012-13), and has given keynote/invited/tutorial/panel talks at 20+ international conferences over the past 10 years. He has been an Associate Editor for IEEE Transactions on Image Processing, IEEE Transactions on Circuits and Systems for Video Technology, IEEE Transactions on Multimedia, IEEE Signal Processing Letters, Quality and User Experience, and the Journal of Visual Communication and Image Representation. He was also Guest Editor for 7 special issues in international journals and chaired the IEEE MMTC QoE Interest Group (2012-2014); he has been a Technical Program Chair for the IEEE Int'l Conf. on Multimedia and Expo (ICME 2013), the International Workshop on Quality of Multimedia Experience (QoMEX 2014), the International Packet Video Workshop (PV 2015), the Pacific-Rim Conf. on Multimedia (PCM 2012), and IEEE Visual Communications and Image Processing (VCIP 2017). He believes that good theory is practical, and he has delivered 10+ major systems and modules for industrial deployment with the technology developed.
Schedule:
Friday, July 10 – London (BST time zone)
Session 1
Session Chair: Dietmar Saupe (University of Konstanz)
Start | End | Talk | Speaker |
---|---|---|---|
14:00 | 15:00 | Visual Perception and JND Modelling: Progress & Challenges | Prof. Weisi Lin (Nanyang Technological University, Singapore) |
15:00 | 15:10 | Break |
Session 2
Session Chair: Yun Zhang (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences)
Start | End | Paper | Author |
---|---|---|---|
15:10 | 15:30 | A JND Dataset Based On VVC Compressed Images | Xuelin Shen1, Zhangkai Ni1, Wenhan Yang1, Xinfeng Zhang2, Shiqi Wang1, Sam Kwong1 (1City University of Hong Kong, 2University of Chinese Academy of Sciences) |
15:30 | 15:50 | Unsupervised Deep Learning For Just Noticeable Difference Estimation | Yuhao Wu, Weiping Ji, Jinjian Wu (Xidian University) |
15:50 | 16:10 | Subjective Assessment Of Global Picture-Wise Just Noticeable Difference | Hanhe Lin1, Mohsen Jenadeleh1, Guangan Chen1, Ulf-Dietrich Reips1, Raouf Hamzaoui2, Dietmar Saupe1 (1University of Konstanz, 2De Montfort University) |
16:10 | 16:20 | Break |
Session 3
Session Chair: Dietmar Saupe (University of Konstanz)
Start | End | Paper | Author |
---|---|---|---|
16:20 | 16:40 | JUNIPER: A JND Based Perceptual Video Coding Framework To Jointly Utilize Saliency And JND | Sanaz Nami, Farhad Pakdaman, Mahmoud R. Hashemi (University of Tehran) |
16:40 | 17:00 | Satisfied User Ratio Prediction With Support Vector Regression For Compressed Stereo Images | Chunling Fan1, Yun Zhang1, Raouf Hamzaoui2, Djemel Ziou3, Qingshan Jiang1 (1Shenzhen Institutes of Advanced Technology, 2De Montfort University, 3Université de Sherbrooke) |
Organisers:
- Pradeep K. Atrey, University at Albany, State University of New York (SUNY), USA
- Nitin Khanna, Indian Institute of Technology, Gandhinagar, India
- Nalini K. Ratha, IBM Thomas J. Watson Research Center, USA
- Luisa Verdoliva, University Federico II of Naples, Italy
- Christian von der Weth, National University of Singapore, Singapore
Description:
Fake news is a type of social hacking designed to change a reader's point of view, which may lead them to change their opinion about an individual, an organization, or a belief, and to make misinformed decisions. With the advent of multimedia editing tools, fake news typically contains multiple types of media, such as text, image, video, and audio. Media-rich fake news can easily be made to look real. Furthermore, fake news is prone to rapid dissemination through the ever-increasing accessibility of the internet and online social media outlets. Although there has been significant progress in multimedia security and forensics research, the modern web and social media avenues for the creation and sharing of multimedia content pose fresh challenges for fake content identification and mitigation. This workshop aims to bring forward further advances in the area of fake multimedia, in terms of both its proactive identification and the prevention of the spread of such content.
Scope and Topics:
We invite the latest high-quality papers presenting or addressing issues related to media-rich fake news, including but not limited to:
- Media-rich fake email detection and prevention.
- Media-rich fake news identification over social media.
- Media-rich fake news mitigation over social media.
- Content management policy for news publishers.
- Content filtering for the web.
- Impact and severity of fake content.
- Secure models, policies and practices for safe content filtering.
- Identification and credibility of the author and the publishing source of fake content.
- Fake content alert mechanisms.
Panel
DeFaking Multimedia
Abstract:
This panel focuses on two aspects of fake multimedia content: detection mechanisms and prevention strategies. The panelists will share their views on current challenges of defaking multimedia content and discuss the state-of-the-art detection and prevention methods.
Panelists:
Pradeep K. Atrey is an associate professor at the State University of New York, Albany, NY, USA. He is also the director of the computer science undergraduate program and the founding co-director of the Albany Lab for Privacy and Security (ALPS). Previously, he was an associate professor at the University of Winnipeg, Canada. He received his Ph.D. in Computer Science from the National University of Singapore and was a postdoctoral researcher at the Multimedia Communications Research Laboratory, University of Ottawa, Canada. His current research interests are in security and privacy, with a focus on multimedia surveillance and privacy, multimedia security, secure-domain cloud-based large-scale multimedia analytics, and social media. He has authored/co-authored over 125 research articles in reputed ACM, IEEE, and Springer journals and conferences. His research has been funded by the Canadian government agencies NSERC and DFAIT, and by the government of Saudi Arabia. Dr. Atrey has received many awards, including the ACM TOMM Associate Editor of the Year Award (2015), the IEEE Comm. Soc. MMTC Best R-Letter Editor Award (2015), the Erica and Arnold Rogers Award for Excellence in Research and Scholarship (2014), and the ETRI Journal Best Editor Award (2012). He was also recognized as an ACM Multimedia Rising Star (2015).
Siwei Lyu is a Professor in the Department of Computer Science and the founding Director of the Computer Vision and Machine Learning Lab (CVML) at the University at Albany, State University of New York.
Dr. Lyu was an Assistant Professor from 2008 to 2014, and a tenured Associate Professor from 2014 to 2019, in the same department. From 2005 to 2008, he was a Post-Doctoral Research Associate at the Howard Hughes Medical Institute and the Center for Neural Science of New York University. He was an Assistant Researcher at Microsoft Research Asia (then Microsoft Research China) in 2001. Dr. Lyu received his Ph.D. degree in Computer Science from Dartmouth College in 2005, and his M.S. degree in Computer Science in 2000 and B.S. degree in Information Science in 1997, both from Peking University, China.
Dr. Lyu’s research interests include digital media forensics, computer vision, and machine learning. Dr. Lyu has published over 130 refereed journal and conference papers. Dr. Lyu’s research projects are funded by NSF, DARPA, NIJ, UTRC, IBM and University at Albany, SUNY. He is the recipient of the IEEE Signal Processing Society Best Paper Award (2011), the National Science Foundation CAREER Award (2010), SUNY Albany’s Presidential Award for Excellence in Research and Creative Activities (2017), SUNY Chancellor’s Award for Excellence in Research and Creative Activities (2018) and Google Faculty Research Award (2019).
Dr. Lyu currently serves on the IEEE Signal Processing Society’s Information Forensics and Security Technical Committee, and is on the Editorial Board of IEEE Transactions on Information Forensics and Security. Dr. Lyu is a senior member of IEEE, a member of ACM, a member of Sigma Xi, and a member of Omicron Delta Kappa.
Nasir Memon is a professor in the Department of Computer Science and Engineering at NYU Tandon. He is co-founder of the Center for Cyber Security (CCS), in both New York and Abu Dhabi; CCS is a collaborative initiative of multiple schools within NYU, including the Law School. He is the founder of CSAW, the world's largest student-run cybersecurity competition, and also the founder of the Offensive Security, Incident Response and Internet Security Laboratory (OSIRIS) at NYU Tandon. His research interests include digital forensics, authentication, biometrics, data compression, network security, and security and human behavior. Professor Memon has published over 250 articles in journals and conference proceedings and holds a dozen patents in image compression and security. He has won several awards, including the Jacobs Excellence in Education Award and several best paper awards. He has served on the editorial boards of several journals and was Editor-in-Chief of IEEE Transactions on Information Forensics and Security. He is an IEEE Fellow and an SPIE Fellow.
Luisa Verdoliva is an Associate Professor at University Federico II of Naples (Italy). In 2018 she was a visiting professor at Friedrich-Alexander-University (FAU), and in 2019-2020 she was a visiting scientist at Google AI in San Francisco. Her scientific interests are in the field of image processing, with her main contributions in the area of multimedia forensics.
She was the Principal Investigator for University Federico II of Naples in the DISPARITY (Digital, Semantic and Physical Analysis of Media Integrity) project funded by DARPA under the MEDIFOR program (2016-2020). She was General co-Chair of the 2019 ACM Workshop on Information Hiding and Multimedia Security, Technical Chair of the 2019 IEEE Workshop on Information Forensics and Security, and Tutorial Chair of the 2016 IEEE Workshop on Information Forensics and Security. She is on the Editorial Board of IEEE Transactions on Information Forensics and Security and IEEE Signal Processing Letters. Dr. Verdoliva is vice-Chair of the IEEE Signal Processing Society's Information Forensics and Security Technical Committee. She is the recipient of a Google Faculty Research Award and a TUM-IAS Hans Fischer Senior Fellowship.
Schedule:
Friday, July 10 – London (BST time zone)
Session Chair: Pradeep Atrey (University at Albany and State University of New York)
Start | End | Paper | Author |
---|---|---|---|
14:00 | 14:20 | DeepFake Detection: Current Challenges and Next Steps | Siwei Lyu (University at Albany, State University of New York) |
14:20 | 14:40 | Information Distribution based Defense against Physical Attacks on Object Detection | Guangzhi Zhou1,2, Hongchao Gao1,2, Peng Chen1,2, Jin Liu1,2, Jiao Dai1, Jizhong Han1, Ruixuan Li3 (1Institute of Information Engineering, Chinese Academy of Sciences, 2School of Cyber Security, University of Chinese Academy of Sciences, 3Huazhong University of Science and Technology) |
14:40 | 15:00 | Nudging Users to Slow Down the Spread of Fake News in Social Media | Christian von der Weth, Jithin Vachery, Mohan Kankanhalli (National University of Singapore) |
15:00 | 15:10 | Break | |
15:10 | 15:55 | Panel on DeFaking Multimedia | Luisa Verdoliva1, Siwei Lyu2, Pradeep Atrey2, Nasir Memon3 (1University Federico II of Naples, 2State University of New York, 3New York University) |