# Academic Horizons Today (2016.09.27)

cs.AI – Artificial Intelligence
cs.CL – Computation and Language
cs.CR – Cryptography and Security
cs.CV – Computer Vision and Pattern Recognition
cs.CY – Computers and Society
cs.DC – Distributed, Parallel, and Cluster Computing
cs.DS – Data Structures and Algorithms
cs.IR – Information Retrieval
cs.IT – Information Theory
cs.LG – Machine Learning
cs.SD – Sound
cs.SI – Social and Information Networks
gr-qc – General Relativity and Quantum Cosmology
math.NT – Number Theory
math.PR – Probability
math.ST – Statistics Theory
q-bio.QM – Quantitative Methods
stat.AP – Applications
stat.ME – Methodology
stat.ML – Machine Learning (Statistics)

cs.AI – Artificial Intelligence
cs.CL – Computation and Language
cs.CR – Cryptography and Security
cs.CV – Computer Vision and Pattern Recognition
cs.CY – Computers and Society
cs.DC – Distributed, Parallel, and Cluster Computing
cs.DL – Digital Libraries
cs.DS – Data Structures and Algorithms
cs.LG – Machine Learning
cs.MM – Multimedia
cs.NE – Neural and Evolutionary Computing
cs.NI – Networking and Internet Architecture
cs.SD – Sound
math.OC – Optimization and Control
math.ST – Statistics Theory
stat.AP – Applications
stat.ME – Methodology
stat.ML – Machine Learning (Statistics)

cs.AI – Artificial Intelligence
cs.CL – Computation and Language
cs.CV – Computer Vision and Pattern Recognition
cs.CY – Computers and Society
cs.DC – Distributed, Parallel, and Cluster Computing
cs.DL – Digital Libraries
cs.DS – Data Structures and Algorithms
cs.IT – Information Theory
cs.LG – Machine Learning
cs.MM – Multimedia
cs.NE – Neural and Evolutionary Computing
cs.SE – Software Engineering
cs.SI – Social and Information Networks
cs.SY – Systems and Control
math.ST – Statistics Theory
q-bio.NC – Neurons and Cognition
stat.AP – Applications
stat.ME – Methodology
stat.ML – Machine Learning (Statistics)
stat.OT – Other Statistics

• [cs.AI]ECO-AMLP: A Decision Support System using an Enhanced Class
Outlier with Automatic Multilayer Perceptron for Diabetes Prediction
• [cs.AI]Model Selection with Nonlinear Embedding for Unsupervised
Domain Adaptation
• [cs.AI]Toward Goal-Driven Neural Network Models for the Rodent
Whisker-Trigeminal System
• [cs.CL]Comparison of Modified Kneser-Ney and Witten-Bell Smoothing
Techniques in Statistical Language Model of Bahasa Indonesia
• [cs.CL]End-to-end Conversation Modeling Track in DSTC6
• [cs.CL]Named Entity Recognition with stack residual LSTM and
trainable bias decoding
• [cs.CL]Neural Machine Translation with Gumbel-Greedy Decoding
• [cs.CL]Personalization in Goal-Oriented Dialog
• [cs.CR]Integrating self-efficacy into a gamified approach to thwart
phishing attacks
• [cs.CV]Computer-aided implant design for the restoration of cranial
defects
• [cs.CV]Joint Prediction of Depths, Normals and Surface Curvature
from RGB Images using CNNs
• [cs.CV]Listen to Your Face: Inferring Facial Action Units from Audio
Channel
• [cs.CV]Sampling Matters in Deep Embedding Learning
• [cs.CV]Training Adversarial Discriminators for Cross-channel
Abnormal Event Detection in Crowds
• [cs.CY]Computational Controversy
• [cs.CY]Human decisions in moral dilemmas are largely described by
Utilitarianism: virtual car driving study provides guidelines for ADVs
• [cs.CY]Mediated behavioural change in human-machine networks:
exploring network characteristics, trust and motivation
• [cs.DC]Heterogeneous MPSoCs for Mixed Criticality Systems:
Challenges and Opportunities
• [cs.DC]Interoperable Convergence of Storage, Networking and
Computation
• [cs.DC]Optimizing the Performance of Reactive Molecular Dynamics
Simulations for Multi-Core Architectures
• [cs.DS]Testing Piecewise Functions
• [cs.IR]Causal Embeddings for Recommendation
• [cs.IR]Comparing Neural and Attractiveness-based Visual Features for
Artwork Recommendation
• [cs.IR]Contextual Sequence Modeling for Recommendation with
Recurrent Neural Networks
• [cs.IR]Specializing Joint Representations for the task of Product
Recommendation
• [cs.IT]A Combinatorial Methodology for Optimizing Non-Binary
Graph-Based Codes: Theoretical Analysis and Applications in Data
Storage
• [cs.IT]Common-Message Broadcast Channels with Feedback in the
Nonasymptotic Regime: Full Feedback
• [cs.IT]Communication-Aware Computing for Edge Processing
• [cs.IT]Fundamental Limits of Universal Variable-to-Fixed Length
Coding of Parametric Sources
• [cs.IT]Fundamental Limits on Delivery Time in Cloud- and Cache-Aided
Heterogeneous Networks
• [cs.IT]High Performance Non-Binary Spatially-Coupled Codes for Flash
Memories
• [cs.IT]On Single-Antenna Rayleigh Block-Fading Channels at Finite
Blocklength
• [cs.IT]Retrodirective Multi-User Wireless Power Transfer with
Massive MIMO
• [cs.LG]Efficient Approximate Solutions to Mutual Information Based
Global Feature Selection
• [cs.LG]How Much Data is Enough? A Statistical Approach with Case
Study on Longitudinal Driving Behavior
• [cs.SD]Revisiting Autotagging Toward Faultless Instrumental
Playlists Generation
• [cs.SI]Information Diffusion in Social Networks in Two Phases
• [gr-qc]Deep Transfer Learning: A new deep learning glitch
• [math.NT]New cubic self-dual codes of length 54, 60 and 66
• [math.PR]Global algorithms for maximal eigenpair
• [math.ST]Asymmetric Matrix-Valued Covariances for Multivariate
Random Fields on Spheres
• [math.ST]Consistent Estimation in General Sublinear Preferential
Attachment Trees
• [math.ST]Multi-sequence segmentation via score and higher-criticism
tests
• [math.ST]Nonparametric Bayesian estimation of a Hölder continuous
diffusion coefficient
• [math.ST]Shape-constrained partial identification of a population
mean under unknown probabilities of sample selection
• [q-bio.QM]Cross-validation failure: small sample sizes lead to large
error bars
• [stat.AP]A Bayesian approach to modeling mortgage default and
prepayment
• [stat.AP]Causal Inference in Travel Demand Modeling (and the lack
thereof)
• [stat.AP]The Cost of Transportation : Spatial Analysis of US Fuel
Prices
• [stat.ME]Asymptotics of ABC
• [stat.ME]Bayesian Penalized Regression
• [stat.ME]Estimation and adaptive-to-model testing for regressions
with diverging number of predictors
• [stat.ME]Model choice in separate families: A comparison between the
FBST and the Cox test
• [stat.ME]Multivariate Geometric Skew-Normal Distribution
• [stat.ME]Pathwise Least Angle Regression and a Significance Test for
the Elastic Net
• [stat.ME]Point and Interval Estimation of Weibull Parameters Based
on Joint Progressively Censored Data
• [stat.ME]Spatially filtered unconditional quantile regression
• [stat.ML]A Variance Maximization Criterion for Active Learning
• [stat.ML]A-NICE-MC: Adversarial Training for MCMC
• [stat.ML]Query Complexity of Clustering with Side Information

• [cs.AI]Regulating Reward Training by Means of Certainty Prediction
in a Neural Network-Implemented Pong Game
• [cs.CL]AMR-to-text generation as a Traveling Salesman Problem
• [cs.CL]Deep Multi-Task Learning with Shared Memory
• [cs.CL]Incorporating Relation Paths in Neural Relation Extraction
• [cs.CL]Language as a Latent Variable: Discrete Generative Models for
Sentence Compression
• [cs.CR]Building accurate HAV exploiting User Profiling and Sentiment
Analysis
• [cs.CV]EFANNA : An Extremely Fast Approximate Nearest Neighbor
Search Algorithm Based on kNN Graph
• [cs.CV]EgoCap: Egocentric Marker-less Motion Capture with Two
Fisheye Cameras
• [cs.CV]Example-Based Image Synthesis via Randomized Patch-Matching
• [cs.CV]Funnel-Structured Cascade for Multi-View Face Detection with
Alignment-Awareness
• [cs.CV]Real-time Human Pose Estimation from Video with Convolutional
Neural Networks
• [cs.CV]The face-space duality hypothesis: a computational model
• [cs.CY]On the (im)possibility of fairness
• [cs.CY]Tracking the Trackers: Towards Understanding the Mobile
• [cs.DC]MPI Parallelization of the Resistive Wall Code STARWALL:
Report of the EUROfusion High Level Support Team Project JORSTAR
• [cs.DL]OCR++: A Robust Framework For Information Extraction from
Scholarly Articles
• [cs.DS]Scheduling Under Power and Energy Constraints
• [cs.LG]A Novel Progressive Multi-label Classifier for
Classincremental Data
• [cs.LG]Multilayer Spectral Graph Clustering via Convex Layer
Aggregation
• [cs.LG]Using Neural Network Formalism to Solve Multiple-Instance
Problems
• [cs.MM]Deep Quality: A Deep No-reference Quality Assessment System
• [cs.NE]Deep Learning in Multi-Layer Architectures of Dense Nuclei
• [cs.NE]Multi-Output Artificial Neural Network for Storm Surge
Prediction in North Carolina
• [cs.NI]Hydra: Leveraging Functional Slicing for Efficient
Distributed SDN Controllers
• [cs.SD]Discovering Sound Concepts and Acoustic Relations In Text
• [cs.SD]Novel stochastic properties of the short-time spectrum for
unvoiced pronunciation modeling and synthesis
• [math.OC]Screening Rules for Convex Problems
• [math.ST]A Wald-type test statistic for testing linear hypothesis in
logistic regression models based on minimum density power divergence
estimator
• [math.ST]On the Non-Existence of Unbiased Estimators in Constrained
Estimation Problems
• [math.ST]Robust Confidence Intervals in High-Dimensional
Left-Censored Regression
• [stat.AP]A Few Photons Among Many: Unmixing Signal and Noise for
Photon-Efficient Active Imaging
• [stat.AP]Predicting human-driving behavior to help driverless
vehicles drive: random intercept Bayesian Additive Regression Trees
• [stat.ME]Changepoint Detection in the Presence of Outliers
• [stat.ME]Efficient Feature Selection With Large and High-dimensional
Data
• [stat.ME]Fully Bayesian Estimation and Variable Selection in
Partially Linear Wavelet Models
• [stat.ME]Semiparametric clustered overdispersed multinomial
goodness-of-fit of log-linear models
• [stat.ME]Statistical Modeling for Spatio-Temporal Degradation Data
• [stat.ML]A penalized likelihood method for classification with
matrix-valued predictors
• [stat.ML]Constraint-Based Clustering Selection
• [stat.ML]Estimating Probability Distributions using “Dirac” Kernels
• [stat.ML]One-vs-Each Approximation to Softmax for Scalable
Estimation of Probabilities

• [cs.AI]Open Problem: Approximate Planning of POMDPs in the class of
Memoryless Policies
• [cs.AI]Practical optimal experiment design with probabilistic
programs
• [cs.CL]An Efficient Character-Level Neural Machine Translation
• [cs.CL]Cohesion and Coalition Formation in the European Parliament:
• [cs.CL]Ensemble of Jointly Trained Deep Neural Network-Based
Acoustic Models for Reverberant Speech Recognition
• [cs.CL]Proceedings of the LexSem+Logics Workshop 2016
• [cs.CL]The Roles of Path-based and Distributional Information in
Recognizing Lexical Semantic Relations
• [cs.CV]An image compression and encryption scheme based on deep
learning
• [cs.CV]Frame- and Segment-Level Features and Candidate Pool
Evaluation for Video Caption Generation
• [cs.CV]Geometry-aware Similarity Learning on SPD Manifolds for
Visual Recognition
• [cs.CV]Globally Variance-Constrained Sparse Representation for Image
Set Compression
• [cs.CV]Large Angle based Skeleton Extraction for 3D Animation
• [cs.CY]Modelling Student Behavior using Granular Large Scale Action
Data from a MOOC
• [cs.DC]Safe Serializable Secure Scheduling: Transactions and the
• [cs.DC]The BioDynaMo Project: Creating a Platform for Large-Scale
Reproducible Biological Simulations
• [cs.DL]Anomalies in the peer-review system: A case study of the
journal of High Energy Physics
• [cs.DS]Faster Sublinear Algorithms using Conditional Sampling
• [cs.DS]Lecture Notes on Spectral Graph Methods
• [cs.IT]Hard Clusters Maximize Mutual Information
• [cs.LG]Application of multiview techniques to NHANES dataset
• [cs.LG]Dynamic Collaborative Filtering with Compound Poisson
Factorization
• [cs.LG]Mollifying Networks
• [cs.LG]Reinforcement Learning algorithms for regret minimization in
structured Markov Decision Processes
• [cs.MM]Towards Music Captioning: Generating Music Playlist
Descriptions
• [cs.NE]Power Series Classification: A Hybrid of LSTM and a Novel
• [cs.SE]A Proposal for the Measurement and Documentation of Research
Software Sustainability in Interactive Metadata Repositories
• [cs.SI]Feature Driven and Point Process Approaches for Popularity
Prediction
• [cs.SI]Learning Latent Local Conversation Modes for Predicting
Community Endorsement in Online Discussions
• [cs.SI]Observer Placement for Source Localization: The Effect of
Budgets and Transmission Variance
• [cs.SY]Graph Distances and Controllability of Networks
• [math.ST]Adaptive confidence sets for matrix completion
• [math.ST]Bayesian Posteriors For Arbitrarily Rare Events
• [q-bio.NC]A Three Spatial Dimension Wave Latent Force Model for
Describing Excitation Sources and Electric Potentials Produced by Deep
Brain Stimulation
• [stat.AP]The Use of Minimal Spanning Trees in Particle Physics
• [stat.AP]Tweedie distributions for fitting semicontinuous health
care utilization cost data
• [stat.ME]A Cautionary Tale: Mediation Analysis Applied to Censored
Survival Data
• [stat.ME]A Measure of Directional Outlyingness with Applications to
Image Data and Video
• [stat.ME]Exact balanced random imputation for sample survey data
• [stat.ME]Globally Homogenous Mixture Components and Local
Heterogeneity of Rank Data
• [stat.ML]A Convolutional Autoencoder for Multi-Subject fMRI Data
Aggregation
• [stat.ML]Clustering Mixed Datasets Using Homogeneity Analysis with
Applications to Big Data
• [stat.ML]Faster Principal Component Regression via Optimal
Polynomial Approximation to sgn(x)
• [stat.ML]Large-scale Learning With Global Non-Decomposable
Objectives
• [stat.ML]Outlier Detection on Mixed-Type Data: An Energy-based
Approach
• [stat.OT]Putting Down Roots: A Graphical Exploration of Community
Attachment

• [cs.AI]ECO-AMLP: A Decision Support System using an Enhanced Class
Outlier with Automatic Multilayer Perceptron for Diabetes Prediction

Maham Jahangir, Hammad Afzal, Mehreen Ahmed, Khawar Khurshid, Raheel
Nawaz

http://arxiv.org/abs/1706.07679v1

• [cs.AI]Regulating Reward Training by Means of Certainty Prediction
in a Neural Network-Implemented Pong Game

Matt Oberdorfer, Matt Abuzalaf
http://arxiv.org/abs/1609.07434v1

• [cs.AI]Open Problem: Approximate Planning of POMDPs in the class
of Memoryless Policies

Kamyar Azizzadenesheli, Alessandro Lazaric, Animashree Anandkumar
http://arxiv.org/abs/1608.04996v1

With advanced data-analytical techniques, efforts toward more accurate
decision support systems for disease prediction are on the rise.
Surveys by the World Health Organization (WHO) indicate a large
increase in the number of diabetic patients and related deaths each
year, so early diagnosis of diabetes is a major concern among
researchers and practitioners. This paper presents an application of
an Automatic Multilayer Perceptron combined with an outlier detection
method, Enhanced Class Outlier Detection using a distance-based
algorithm, to create a prediction framework named Enhanced Class
Outlier with Automatic Multilayer Perceptron (ECO-AMLP). A series of
experiments is performed on the publicly available Pima Indian
Diabetes Dataset to compare ECO-AMLP with individual classifiers as
well as ensemble-based methods. The outlier technique used in our
framework gave better results than other pre-processing and
classification techniques. Finally, the results are compared with
other state-of-the-art methods reported in the literature for diabetes
prediction on PIDD; the achieved accuracy of 88.7% bests all other
reported studies.
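
The pipeline above, outliers filtered per class before classification,
can be sketched generically. This is a crude distance-to-centroid
filter, not the paper's Enhanced Class Outlier method; the data and
the `keep` ratio are hypothetical.

```python
import math

def euclid(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def remove_class_outliers(samples, labels, keep=0.8):
    """Drop the points farthest from their own class centroid.

    A generic distance-based stand-in for the class-outlier step:
    score each point by its distance to its class centroid and keep
    the closest `keep` fraction of the dataset.
    """
    centroids = {}
    for c in set(labels):
        pts = [s for s, l in zip(samples, labels) if l == c]
        dim = len(pts[0])
        centroids[c] = [sum(p[d] for p in pts) / len(pts) for d in range(dim)]
    scored = sorted(zip(samples, labels),
                    key=lambda sl: euclid(sl[0], centroids[sl[1]]))
    kept = scored[:int(len(scored) * keep)]
    return [s for s, _ in kept], [l for _, l in kept]

# Hypothetical toy data: class 0 near the origin with one far outlier.
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (9.0, 9.0), (5.0, 5.0), (5.1, 5.0)]
y = [0, 0, 0, 0, 1, 1]
Xc, yc = remove_class_outliers(X, y, keep=0.8)
assert (9.0, 9.0) not in Xc   # the far outlier is among the filtered points
assert len(Xc) == 4
```

The cleaned `Xc`, `yc` would then be fed to the classifier (an
automatically tuned multilayer perceptron in the paper).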

We present the first reinforcement-learning model to self-improve its
reward-modulated training, implemented through a continuously improving
“intuition” neural network. An agent was trained to play the arcade
video game Pong with two reward-based alternatives: one where the
paddle was placed randomly during training, and a second where the
agent was trained such that it could develop a sense of “certainty” as
to how probable its own predicted paddle position would be to return
the ball. If the agent was less than 95% certain to return the ball,
the policy used an intuition neural network to place the paddle. We
trained both architectures for an equivalent number of epochs and
tested learning performance by letting the trained programs play
against a near-perfect opponent. Through this, we found that the
reinforcement learning model that uses an intuition neural network for
placing the paddle during reward training quickly overtakes the simple
architecture in its ability to outplay the near-perfect opponent,
additionally outscoring that opponent by an increasingly wide margin.
Planning plays an important role in the broad class of decision
theory. Planning has drawn much attention in recent work in the
robotics and sequential decision making areas. Recently, Reinforcement
Learning (RL), as an agent-environment interaction problem, has
brought further attention to planning methods. Generally in RL, one
can assume a generative model, e.g. graphical models, for the
environment, and then the task for the RL agent is to learn the model
parameters and find the optimal strategy based on these learnt
parameters. Based on environment behavior, the agent can assume
various types of generative models, e.g. a Multi-Armed Bandit for a
static environment, or Markov Decision Process (MDP) for a dynamic
environment. The advantage of these popular models is their
simplicity, which results in tractable methods of learning the
parameters and finding the optimal policy. The drawback of these
models is again their simplicity: these models usually underfit and
underestimate the actual environment behavior. For example, in
robotics, the agent usually has noisy observations of the environment
inner state and MDP is not a suitable model. More complex models like
Partially Observable Markov Decision Process (POMDP) can compensate
for this drawback. Fitting this model to the environment, where the
partial observation is given to the agent, generally gives dramatic
performance improvement, sometimes unbounded improvement, compared to
MDP. In general, finding the optimal policy for the POMDP model is
computationally intractable and fully non-convex, even for the class
of memoryless policies. The open problem is to come up with a method
to find an exact or an approximate optimal stochastic memoryless
policy for POMDP models.
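
To make "stochastic memoryless policy" concrete: such a policy maps
the current observation alone to an action distribution, with no
belief state. The toy two-state POMDP below is entirely hypothetical
and only illustrates evaluating such a policy by Monte Carlo.

```python
import random

def observe(state, noise=0.2):
    # Noisy observation of the hidden state: correct with prob 1 - noise.
    return state if random.random() > noise else 1 - state

def step(state, action):
    # Reward 1 for matching the hidden state; the state flips with prob 0.5.
    reward = 1.0 if action == state else 0.0
    next_state = state if random.random() < 0.5 else 1 - state
    return next_state, reward

def evaluate_memoryless_policy(policy, episodes=2000, horizon=20):
    """Monte-Carlo estimate of the return of a stochastic memoryless policy.

    `policy[obs]` is the probability of taking action 1 given only the
    current observation obs (no memory of past observations).
    """
    total = 0.0
    for _ in range(episodes):
        state = random.choice([0, 1])
        for _ in range(horizon):
            obs = observe(state)
            action = 1 if random.random() < policy[obs] else 0
            state, reward = step(state, action)
            total += reward
    return total / episodes

random.seed(0)
trusting = {0: 0.0, 1: 1.0}   # act on the observation directly
uniform = {0: 0.5, 1: 0.5}    # ignore the observation
assert evaluate_memoryless_policy(trusting) > evaluate_memoryless_policy(uniform)
```

The open problem is the reverse direction: not evaluating, but
efficiently *optimizing* over such policies, which is non-convex.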

• [cs.AI]Model Selection with Nonlinear Embedding for Unsupervised
Domain Adaptation

Hemanth Venkateswara, Shayok Chakraborty, Troy McDaniel, Sethuraman
Panchanathan

http://arxiv.org/abs/1706.07527v1

• [cs.CL]AMR-to-text generation as a Traveling Salesman Problem
Linfeng Song, Yue Zhang, Xiaochang Peng, Zhiguo Wang, Daniel Gildea
http://arxiv.org/abs/1609.07451v1

• [cs.AI]Practical optimal experiment design with probabilistic
programs

Long Ouyang, Michael Henry Tessler, Daniel Ly, Noah Goodman
http://arxiv.org/abs/1608.05046v1

Domain adaptation deals with adapting a classifier trained on data from
a source distribution to work effectively on data from a target
distribution. In this paper, we introduce the Nonlinear Embedding
Transform (NET) for unsupervised domain adaptation. The NET reduces
cross-domain disparity through nonlinear domain alignment. It also
embeds the domain-aligned data such that similar data points are
clustered together. This results in enhanced classification. To
determine the parameters in the NET model (and in other unsupervised
domain adaptation models), we introduce a validation procedure by
sampling source data points that are similar in distribution to the
target data. We test the NET and the validation procedure using
popular image datasets and compare the classification results across
competitive procedures for unsupervised domain adaptation.
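
The validation idea, tune parameters on source points that look like
the target distribution, can be sketched crudely. The ranking rule
below (distance to the target mean) is an assumption for illustration,
not the paper's actual sampling procedure.

```python
import math

def select_validation_subset(source, target, fraction=0.2):
    """Pick source points closest in distribution to the target data.

    A crude stand-in for a distribution-matched validation set: rank
    source points by Euclidean distance to the target mean and keep the
    nearest fraction, so parameters can be tuned without target labels.
    """
    dim = len(target[0])
    target_mean = [sum(p[d] for p in target) / len(target) for d in range(dim)]

    def dist(p):
        return math.sqrt(sum((p[d] - target_mean[d]) ** 2 for d in range(dim)))

    k = max(1, int(len(source) * fraction))
    return sorted(source, key=dist)[:k]

# Hypothetical data: source x-values spread over 0..9, target near x = 3.
source = [(float(i % 10), 0.0) for i in range(50)]
target = [(3.0, 0.0)] * 10
subset = select_validation_subset(source, target)
assert all(abs(p[0] - 3.0) <= 1.0 for p in subset)
```
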

The task of AMR-to-text generation is to generate grammatical text
that sustains the semantic meaning for a given AMR graph. We attack
the task by first partitioning the AMR graph into smaller fragments,
and then generating the translation for each fragment, before finally
deciding the order by solving an asymmetric generalized traveling
salesman problem (AGTSP). A Maximum Entropy classifier is trained to
estimate the traveling costs, and a TSP solver is used to find the
optimized solution. The final model reports a BLEU score of 22.44 on
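
The fragment-ordering step can be pictured as a tiny asymmetric TSP.
The brute-force solver and the transition costs below are toy
illustrations; the paper estimates costs with a Maximum Entropy
classifier and uses a real TSP solver.

```python
from itertools import permutations

def best_fragment_order(fragments, cost):
    """Order generated fragments by solving a tiny asymmetric TSP exactly.

    `cost[i][j]` is the (model-estimated) cost of placing fragment j
    immediately after fragment i. Brute force is only feasible for a
    handful of fragments, but it shows the ordering objective.
    """
    n = len(fragments)
    best = min(permutations(range(n)),
               key=lambda p: sum(cost[p[i]][p[i + 1]] for i in range(n - 1)))
    return [fragments[i] for i in best]

frags = ["the boy", "wants", "to go"]
# Hypothetical transition costs: low cost = fluent continuation.
cost = [[0, 1, 9],
        [9, 0, 1],
        [9, 9, 0]]
assert best_fragment_order(frags, cost) == ["the boy", "wants", "to go"]
```
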

Scientists often run experiments to distinguish competing theories.
This requires patience, rigor, and ingenuity – there is often a large
space of possible experiments one could run. But we need not comb this
space by hand – if we represent our theories as formal models and
explicitly declare the space of experiments, we can automate the
search for good experiments, looking for those with high expected
information gain. Here, we present a general and principled approach
to experiment design based on probabilistic programming languages
(PPLs). PPLs offer a clean separation between declaring problems and
solving them, which means that the scientist can automate experiment
design by simply declaring her model and experiment spaces in the PPL
without having to worry about the details of calculating information
gain. We demonstrate our system in two case studies drawn from
cognitive psychology, where we use it to design optimal experiments in
the domains of sequence prediction and categorization. We find strong
empirical validation that our automatically designed experiments were
indeed optimal. We conclude by discussing a number of interesting
questions for future research.

• [cs.AI]Toward Goal-Driven Neural Network Models for the Rodent
Whisker-Trigeminal System

Chengxu Zhuang, Jonas Kubilius, Mitra Hartmann, Daniel Yamins
http://arxiv.org/abs/1706.07555v1

• [cs.CL]Deep Multi-Task Learning with Shared Memory
Pengfei Liu, Xipeng Qiu, Xuanjing Huang
http://arxiv.org/abs/1609.07222v1

• [cs.CL]An Efficient Character-Level Neural Machine Translation
Shenjian Zhao, Zhihua Zhang
http://arxiv.org/abs/1608.04738v1

In large part, rodents see the world through their whiskers, a
powerful tactile sense enabled by a series of brain areas that form
the whisker-trigeminal system. Raw sensory data arrives in the form of
mechanical input to the exquisitely sensitive, actively-controllable
whisker array, and is processed through a sequence of neural circuits,
eventually arriving in cortical regions that communicate with
decision-making and memory areas. Although a long history of
experimental studies has characterized many aspects of these
processing stages, the computational operations of the
whisker-trigeminal system remain largely unknown. In the present work,
we take a goal-driven deep neural network (DNN) approach to modeling
these computations. First, we construct a biophysically-realistic
model of the rat whisker array. We then generate a large dataset of
whisker sweeps across a wide variety of 3D objects in highly-varying
poses, angles, and speeds. Next, we train DNNs from several distinct
architectural families to solve a shape recognition task in this
dataset. Each architectural family represents a structurally-distinct
hypothesis for processing in the whisker-trigeminal system,
corresponding to different ways in which spatial and temporal
information can be integrated. We find that most networks perform
poorly on the challenging shape recognition task, but that specific
architectures from several families can achieve reasonable performance
levels. Finally, we show that Representational Dissimilarity Matrices
(RDMs), a tool for comparing population codes between neural systems,
can separate these higher-performing networks with data of a type that
could plausibly be collected in a neurophysiological or imaging
experiment. Our results are a proof of concept that goal-driven DNN
models of the whisker-trigeminal system are potentially within reach.

Neural network based models have achieved impressive results on
various specific tasks. However, in previous works, most models are
learned separately based on single-task supervised objectives, which
often suffer from insufficient training data. In this paper, we
propose two deep architectures which can be trained jointly on
multiple related tasks. More specifically, we augment neural models
with an external memory, which is shared by several tasks. Experiments
on two groups of text classification tasks show that our proposed
architectures can improve the performance of a task with the help of
other related tasks.
Neural machine translation aims at building a single large neural
network that can be trained to maximize translation performance. The
encoder-decoder architecture with an attention mechanism achieves a
translation performance comparable to the existing state-of-the-art
phrase-based systems on the task of English-to-French translation.
However, the use of large vocabulary becomes the bottleneck in both
training and improving the performance. In this paper, we propose an
efficient architecture to train a deep character-level neural machine
translation by introducing a decimator and an interpolator. The
decimator is used to sample the source sequence before encoding while
the interpolator is used to resample after decoding. Such a deep model
has two major advantages. It avoids the large vocabulary issue
radically; at the same time, it is much faster and more
memory-efficient in training than conventional character-based models.
More interestingly, our model is able to translate misspelled words
like human beings do.
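
The decimator/interpolator idea, shrink the character sequence before
encoding and stretch it back after decoding, can be shown with plain
pooling and repetition. In the paper both components are learned;
these fixed versions are only a toy illustration.

```python
def decimate(sequence, factor):
    """Shorten a character-level sequence by averaging groups of vectors.

    A fixed stand-in for the learned decimator: it shows how decimation
    cuts the number of steps the encoder must process by `factor`.
    """
    out = []
    for i in range(0, len(sequence), factor):
        chunk = sequence[i:i + factor]
        dim = len(chunk[0])
        out.append([sum(v[d] for v in chunk) / len(chunk) for d in range(dim)])
    return out

def interpolate(sequence, factor):
    """Stretch the decoded sequence back out by repetition (toy version)."""
    return [vec for vec in sequence for _ in range(factor)]

# 8 one-dimensional "character embeddings", decimated by 4 and restored.
chars = [[float(i)] for i in range(8)]
short = decimate(chars, 4)        # length 2: encoder sees 4x fewer steps
restored = interpolate(short, 4)  # back to length 8 for character output
assert short == [[1.5], [5.5]]
assert len(restored) == 8
```
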

• [cs.CL]Comparison of Modified Kneser-Ney and Witten-Bell Smoothing
Techniques in Statistical Language Model of Bahasa Indonesia

Ismail Rusli
http://arxiv.org/abs/1706.07786v1

• [cs.CL]Incorporating Relation Paths in Neural Relation
Extraction

Wenyuan Zeng, Yankai Lin, Zhiyuan Liu, Maosong Sun
http://arxiv.org/abs/1609.07479v1

• [cs.CL]Cohesion and Coalition Formation in the European Parliament

Darko Cherepnalkoski, Andreas Karpf, Igor Mozetic, Miha Grcar
http://arxiv.org/abs/1608.04917v1

Smoothing is one technique to overcome data sparsity in statistical
language models. Although its mathematical definition has no explicit
dependency on a specific natural language, the differing natures of
natural languages result in different effects of smoothing techniques,
as shown for Russian by Whittaker (1998). In this paper, we compare
the Modified Kneser-Ney and Witten-Bell smoothing techniques in a
statistical language model of Bahasa Indonesia. We used training sets
totaling 22M words extracted from the Indonesian version of Wikipedia.
As far as we know, this is the largest training set used to build a
statistical language model for Bahasa Indonesia. Experiments with
3-gram, 5-gram, and 7-gram models showed that Modified Kneser-Ney
consistently outperforms the Witten-Bell smoothing technique in terms
of perplexity. Interestingly, our experiments showed that the 5-gram
Modified Kneser-Ney model outperforms the 7-gram one, while
Witten-Bell smoothing improves consistently as the n-gram order
increases.
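
For reference, interpolated Witten-Bell smoothing, one of the two
techniques compared, can be written down in a few lines for bigrams.
This is a minimal sketch with a unigram backoff, not the authors'
implementation, and the toy corpus is hypothetical.

```python
from collections import Counter

def witten_bell_bigram(corpus):
    """Interpolated Witten-Bell bigram model.

    P(w | h) = (c(h, w) + T(h) * P_uni(w)) / (c(h) + T(h)),
    where T(h) is the number of distinct word types observed after
    history h. More diverse histories give more mass to the backoff.
    """
    unigrams = Counter(corpus)
    bigrams = Counter(zip(corpus, corpus[1:]))
    history = Counter(corpus[:-1])
    types_after = Counter(h for (h, _) in bigrams)
    total = len(corpus)

    def prob(w, h):
        p_uni = unigrams[w] / total
        t = types_after[h]
        denom = history[h] + t
        if denom == 0:            # unseen history: back off to the unigram
            return p_uni
        return (bigrams[(h, w)] + t * p_uni) / denom

    return prob

p = witten_bell_bigram("the cat sat on the mat".split())
# A seen bigram gets more mass than an unseen one with the same history,
# and the distribution over the vocabulary still sums to 1.
assert p("cat", "the") > p("dog", "the")
vocab = set("the cat sat on the mat".split())
assert abs(sum(p(w, "the") for w in vocab) - 1.0) < 1e-9
```

Modified Kneser-Ney replaces the raw backoff unigram with a
continuation probability and uses count-dependent discounts, which is
what drives the perplexity gap reported above.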

Distantly supervised relation extraction has been widely used to find
novel relational facts in plain text. To predict the relation between
a pair of target entities, existing methods rely solely on direct
sentences containing both entities. In fact, there are also many
sentences containing only one of the target entities, which provide
rich and useful information for relation extraction. To address this
issue, we build inference chains between two target entities via
intermediate entities, and propose a path-based neural relation
extraction model to encode the relational semantics from both direct
sentences and inference chains. Experimental results on real-world
datasets show that our model can make full use of those sentences
containing only one target entity, and achieves significant and
consistent improvements on relation extraction compared with
baselines.
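
The "inference chain" notion, relation paths through intermediate
entities, is easy to make concrete with a breadth-first search over a
toy knowledge graph. The graph and relation names below are
hypothetical; the paper encodes such paths with a neural model rather
than enumerating them.

```python
from collections import defaultdict, deque

def inference_chains(triples, head, tail, max_hops=2):
    """Enumerate relation paths from head to tail via intermediate entities.

    Even without a sentence mentioning both entities, a path like
    head -r1-> mid -r2-> tail carries relational evidence.
    """
    graph = defaultdict(list)
    for h, r, t in triples:
        graph[h].append((r, t))
    chains, queue = [], deque([(head, [])])
    while queue:
        node, path = queue.popleft()
        if node == tail and path:
            chains.append(path)
            continue
        if len(path) < max_hops:
            for r, nxt in graph[node]:
                queue.append((nxt, path + [r]))
    return chains

kg = [("Paris", "capital_of", "France"),
      ("France", "located_in", "Europe"),
      ("Paris", "located_in", "Europe")]
# Both the direct edge and a 2-hop chain connect Paris to Europe.
assert inference_chains(kg, "Paris", "Europe") == \
       [["located_in"], ["capital_of", "located_in"]]
```
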

We study the cohesion within and the coalitions between political
groups in the Eighth European Parliament (2014–2019) by analyzing two
entirely different aspects of the behavior of the Members of the
European Parliament (MEPs) in the policy-making processes. On one
hand, we analyze their co-voting patterns and, on the other, their
retweeting behavior. We make use of two diverse datasets in the
analysis. The first one is the roll-call vote dataset, where cohesion
is regarded as the tendency to co-vote within a group, and a coalition
is formed when the members of several groups exhibit a high degree of
co-voting agreement on a subject. The second dataset comes from
Twitter; it captures the retweeting (i.e., endorsing) behavior of the
MEPs and implies cohesion (retweets within the same group) and
coalitions (retweets between groups) from a completely different
perspective. We employ two different methodologies to analyze the
cohesion and coalitions. The first one is based on Krippendorff’s
Alpha reliability, used to measure the agreement between raters in
data-analysis scenarios, and the second one is based on Exponential
Random Graph Models, often used in social-network analysis. We give
general insights into the cohesion of political groups in the European
Parliament, explore whether coalitions are formed in the same way for
different policy areas, and examine to what degree the retweeting
behavior of MEPs corresponds to their co-voting patterns. A novel and
interesting aspect of our work is the relationship between the
co-voting and retweeting patterns.

• [cs.CL]End-to-end Conversation Modeling Track in DSTC6
Chiori Hori, Takaaki Hori
http://arxiv.org/abs/1706.07440v1

• [cs.CL]Language as a Latent Variable: Discrete Generative Models
for Sentence Compression

Yishu Miao, Phil Blunsom
http://arxiv.org/abs/1609.07317v1

• [cs.CL]Ensemble of Jointly Trained Deep Neural Network-Based
Acoustic Models for Reverberant Speech Recognition

Jeehye Lee, Myungin Lee, Joon-Hyuk Chang
http://arxiv.org/abs/1608.04983v1

End-to-end training of neural networks is a promising approach to
automatic construction of dialog systems using a human-to-human dialog
corpus. Recently, Vinyals et al. tested neural conversation models
using OpenSubtitles. Lowe et al. released the Ubuntu Dialogue Corpus
for researching unstructured multi-turn dialogue systems. Furthermore,
the approach has been extended to accomplish task-oriented dialogs to
provide information properly with natural conversation. For example,
Ghazvininejad et al. proposed a knowledge grounded neural conversation
model [3], where the research is aiming at combining conversational
dialogs with task-oriented knowledge using unstructured data such as
Twitter data for conversation and Foursquare data for external
knowledge.However, the task is still limited to a restaurant
information service, and has not yet been tested with a wide variety
of dialog tasks. In addition, it is still unclear how to create
intelligent dialog systems that can respond like a human agent. In
consideration of these problems, we proposed a challenge track to the
6th dialog system technology challenges (DSTC6) using human-to-human
dialog data to mimic human dialog behaviors. The focus of the
challenge track is to train end-to-end conversation models from
human-to-human conversation and accomplish end-to-end dialog tasks in
various situations assuming a customer service, in which a system
plays a role of human agent and generates natural and informative
sentences in response to user’s questions or comments given dialog
context.

In this work we explore deep generative models of text in which the
latent representation of a document is itself drawn from a discrete
language model distribution. We formulate a variational auto-encoder
for inference in this model and apply it to the task of compressing
sentences. In this application the generative model first draws a
latent summary sentence from a background language model, and then
subsequently draws the observed sentence conditioned on this latent
summary. In our empirical evaluation we show that generative
formulations of both abstractive and extractive compression yield
state-of-the-art results when trained on a large amount of supervised
data. Further, we explore semi-supervised compression scenarios where
we show that it is possible to achieve performance competitive with
previously proposed supervised models while training on a fraction of
the supervised data.
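As a toy illustration of the variational objective behind such a model: the ELBO is an expected reconstruction term minus a KL term, and for a discrete (categorical) latent the KL has a closed form. A minimal numpy sketch; all names and numbers are ours, not the paper's:

```python
import numpy as np

def categorical_kl(q, p):
    """KL(q || p) between two categorical distributions given as 1-D arrays."""
    q, p = np.asarray(q, float), np.asarray(p, float)
    mask = q > 0  # treat 0 * log 0 as 0
    return float(np.sum(q[mask] * np.log(q[mask] / p[mask])))

def elbo(log_lik, q, prior):
    """Evidence lower bound: E_q[log p(x|z)] - KL(q(z|x) || p(z))."""
    return float(np.dot(q, log_lik)) - categorical_kl(q, prior)

# Toy setup: three possible latent summary sentences.
q = np.array([0.7, 0.2, 0.1])            # inference network q(z|x)
prior = np.array([1 / 3, 1 / 3, 1 / 3])  # background language model p(z)
log_lik = np.array([-1.0, -2.0, -3.0])   # log p(x|z) for each latent choice
print(elbo(log_lik, q, prior))
```

In the paper the latent is a whole summary sentence drawn from a language model, so the KL is estimated rather than enumerated; the decomposition is the same.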

Distant speech recognition is a challenge, particularly due to the
corruption of speech signals by reverberation caused by large
distances between the speaker and microphone. In order to cope with a
wide range of reverberations in real-world situations, we present
novel approaches for acoustic modeling including an ensemble of deep
neural networks (DNNs) and an ensemble of jointly trained DNNs. First,
multiple DNNs are established, each of which corresponds to a
different reverberation time 60 (RT60) in a setup step. Also, each
model in the ensemble of DNN acoustic models is further jointly
trained, including both feature mapping and acoustic modeling, where
the feature mapping is designed for the dereverberation as a
front-end. In the testing phase, the two most likely DNNs are chosen
from the DNN ensemble using maximum a posteriori (MAP) probabilities,
computed in an online fashion via maximum likelihood (ML)-based blind
RT60 estimation; the posterior probability outputs of the two DNNs are
then combined using the ML-based weights in a simple weighted average.
Extensive experiments demonstrate that the proposed approach leads to
substantial improvements in speech recognition accuracy over the
conventional DNN baseline systems under diverse reverberant
conditions.
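The selection-and-combination step can be sketched as follows; this is our hedged reading of the procedure, assuming the per-model MAP probabilities come from the blind RT60 estimator:

```python
import numpy as np

def combine_top2(map_probs, model_outputs):
    """Choose the two DNNs with the highest MAP probability for the estimated
    RT60 and average their class posteriors, weighted by those probabilities.

    map_probs: shape (n_models,); model_outputs: shape (n_models, n_classes).
    """
    map_probs = np.asarray(map_probs, float)
    outputs = np.asarray(model_outputs, float)
    top2 = np.argsort(map_probs)[-2:]  # indices of the two most likely models
    weights = map_probs[top2] / map_probs[top2].sum()
    return weights @ outputs[top2]

# Three reverberation-specific acoustic models, four toy senone classes.
probs = [0.1, 0.6, 0.3]
outs = [[0.25, 0.25, 0.25, 0.25],
        [0.10, 0.70, 0.10, 0.10],
        [0.20, 0.40, 0.20, 0.20]]
print(combine_top2(probs, outs))
```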

You may say, “My God, this is already getting way too tough.” But,
fortunately, it isn’t that tough because 80 or 90 important models will
carry about 90% of the freight in making you a worldly wise person. And,
of those, only a mere handful really carry very heavy freight.(1)

• [cs.CL]Named Entity Recognition with stack residual LSTM and
trainable bias decoding

Quan Tran, Andrew MacKinlay, Antonio Jimeno Yepes
http://arxiv.org/abs/1706.07598v1

• [cs.CR]Building accurate HAV exploiting User Profiling and
Sentiment Analysis

Alan Ferrari, Angelo Consoli
http://arxiv.org/abs/1609.07302v1

• [cs.CL]Proceedings of the LexSem+Logics Workshop 2016
Steven Neale, Valeria de Paiva, Arantxa Otegi, Alexandre Rademaker
http://arxiv.org/abs/1608.04767v1

Taking Munger’s concept as our starting point, we can figure out how to
use our brains more effectively by building our own latticework of
mental models.

Recurrent Neural Net models are the state-of-the-art for Named Entity
Recognition (NER). We present two innovations to improve the
performance of these models. The first is the introduction of residual
connections between the layers of the Stacked Recurrent Neural
Network. The second is a bias decoding mechanism that allows the
trained system to adapt to a non-differentiable, externally computed
objective, such as the F-measure, addressing the limitations of
traditional loss functions that optimize for accuracy. Our work
improves the state-of-the-art results for both Spanish and English on
the standard train/development/test splits of the CoNLL datasets.
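A minimal sketch of the bias-decoding idea as we read it: add a per-label bias, tuned on held-out data against the external objective (e.g. F-measure), to the model's scores before the argmax. The labels and numbers below are illustrative, not from the paper:

```python
import numpy as np

def biased_decode(scores, bias):
    """Decode each position as argmax of (model score + per-label bias).

    scores: (seq_len, n_labels); bias: (n_labels,), tuned against an
    external, non-differentiable objective rather than accuracy.
    """
    return np.argmax(np.asarray(scores) + np.asarray(bias), axis=1)

# Two toy labels: 0 = O (outside), 1 = ENTITY. A positive bias on ENTITY
# trades precision for recall, which can raise F1 on entity-sparse data.
scores = np.array([[2.0, 1.7],
                   [0.5, 0.4],
                   [3.0, 0.1]])
print(biased_decode(scores, bias=[0.0, 0.0]))  # unbiased decoding
print(biased_decode(scores, bias=[0.0, 0.5]))  # biased toward ENTITY
```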

Social engineering (SE) is one of the most dangerous techniques an
attacker can use against a given entity (private citizen, industry,
government, …). In order to perform SE attacks, it is necessary to
collect as much information as possible about the target (or victims).
The aim of this paper is to report the details of an activity that led
to the development of an automatic tool that extracts, categorizes and
summarizes a target's interests, and thus possible weaknesses with
respect to specific topics. Data is collected from the user's activity
on social networks, then parsed and analyzed using text mining
techniques. The main contribution of the proposed tool is a set of
reports that allow citizens, institutions and private bodies to screen
their exposure to SE attacks, with a strong awareness-raising
potential that should translate into lower risk and a good opportunity
to save money.

Lexical semantics continues to play an important role in driving
research directions in NLP, with the recognition and understanding of
context becoming increasingly important in delivering successful NLP
applications. From word sense and named entity disambiguation to the
creation and maintenance of dictionaries and annotated corpora,
lexical resources have become cornerstones of lexical semantics
research and have produced a wealth of contextual information that NLP
processes can exploit.
efforts both to link and construct from scratch such information – as
Linked Open Data or by way of formal tools coming from logic,
ontologies and automated reasoning – have increased the
interoperability and accessibility of resources for lexical and
computational semantics, even in those languages for which they have
previously been limited. LexSem+Logics 2016 combines the 1st Workshop
on Lexical Semantics for Lesser-Resourced Languages and the 3rd
Workshop on Logics and Ontologies. The accepted papers in our program
covered topics across these two areas, including: the encoding of
plurals in Wordnets, the creation of a thesaurus from multiple sources
based on semantic similarity metrics, and the use of cross-lingual
treebanks and annotations for universal part-of-speech tagging. We
also welcomed talks from two distinguished speakers: on Portuguese
lexical knowledge bases (different approaches, results and their
application in NLP tasks) and on new strategies for open information
extraction (the capture of verb-based propositions from massive text
corpora).

Building the Latticework

• [cs.CL]Neural Machine Translation with Gumbel-Greedy Decoding
Jiatao Gu, Daniel Jiwoong Im, Victor O. K. Li
http://arxiv.org/abs/1706.07518v1

• [cs.CV]EFANNA : An Extremely Fast Approximate Nearest Neighbor
Search Algorithm Based on kNN Graph

Cong Fu, Deng Cai
http://arxiv.org/abs/1609.07228v1

• [cs.CL]The Roles of Path-based and Distributional Information in
Recognizing Lexical Semantic Relations

Vered Shwartz, Ido Dagan
http://arxiv.org/abs/1608.05014v1


The central principle of the mental-models approach is that you must
have a large number of them, and they must be fundamentally lasting
ideas.

Previous neural machine translation models used some heuristic search
algorithms (e.g., beam search) in order to avoid solving the maximum a
posteriori problem over translation sentences at test time. In this
paper, we propose the Gumbel-Greedy Decoding which trains a generative
network to predict translation under a trained model. We solve such a
problem using the Gumbel-Softmax reparameterization, which makes our
generative network differentiable and trainable through standard
stochastic gradient methods. We empirically demonstrate that our
proposed model is effective for generating sequences of discrete
words.
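The Gumbel-Softmax reparameterization at the heart of the method can be sketched in a few lines of numpy. This is a forward-pass illustration only; in training, the softmax keeps the sample differentiable in the logits:

```python
import numpy as np

def gumbel_softmax(logits, tau=1.0, rng=None):
    """Soft one-hot sample via the Gumbel-Softmax reparameterization:
    softmax((logits + Gumbel(0,1) noise) / tau). As tau -> 0 the sample
    approaches a hard one-hot vector; larger tau gives smoother samples."""
    rng = rng or np.random.default_rng()
    g = -np.log(-np.log(rng.uniform(size=np.shape(logits))))  # Gumbel noise
    y = (np.asarray(logits, float) + g) / tau
    y = np.exp(y - y.max())  # numerically stable softmax
    return y / y.sum()

logits = np.array([1.0, 2.0, 0.5])  # toy vocabulary of three words
sample = gumbel_softmax(logits, tau=0.5, rng=np.random.default_rng(0))
print(sample, sample.argmax())
```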

Approximate nearest neighbor (ANN) search is a fundamental problem in
many areas of data mining, machine learning and computer vision. The
performance of traditional hierarchical structure (tree) based methods
decreases as the dimensionality of data grows, while hashing based
methods usually lack efficiency in practice. Recently, graph-based
methods have drawn considerable attention. The main idea is that a
neighbor of a neighbor is also likely to be a neighbor, which we refer
to as NN-expansion. These methods construct a k-nearest neighbor (kNN)
graph offline. At the online search stage, they find candidate
neighbors of a query point in some way (e.g., random selection), and
then iteratively check the neighbors of these candidate neighbors for
closer ones. Despite some promising results, these approaches have two
main problems: 1) they tend to converge to local optima, and 2)
constructing a kNN graph is time consuming. We find that both problems
can be nicely solved by providing a good initialization for
NN-expansion. In this paper, we propose EFANNA, an extremely fast
approximate nearest neighbor search algorithm based on the kNN graph.
EFANNA combines the advantages of hierarchical-structure-based methods
and nearest-neighbor-graph-based methods. Extensive experiments show
that EFANNA outperforms state-of-the-art algorithms on both
approximate nearest neighbor search and approximate nearest neighbor
graph construction; to the best of our knowledge, it is the fastest
algorithm so far for both. An EFANNA library based on this research
has been released on GitHub.
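The NN-expansion iteration itself is easy to sketch. A toy version, assuming a prebuilt kNN graph; all names are ours, and a real implementation would track visited nodes and distances far more carefully:

```python
import numpy as np

def nn_expansion(data, knn_graph, query, init, iters=5):
    """Greedy ANN search on a prebuilt kNN graph: repeatedly expand the
    candidate pool with the graph neighbours of current candidates
    ('a neighbor of a neighbor is also likely to be a neighbor') and
    keep only the points closest to the query."""
    cand = list(init)
    for _ in range(iters):
        pool = set(cand)
        for i in cand:
            pool.update(knn_graph[i])
        cand = sorted(pool, key=lambda i: np.linalg.norm(data[i] - query))[:len(init)]
    return cand[0]  # index of the best neighbour found

# Toy 1-D dataset with a 2-NN graph built offline.
data = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
graph = {0: [1, 2], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 2]}
print(nn_expansion(data, graph, query=np.array([3.9]), init=[0]))
```

Note how the search walks from a poor initialization (point 0) across the graph toward the true nearest neighbour; EFANNA's contribution is making that initialization good in the first place.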

Recognizing various semantic relations between terms is crucial for
many NLP tasks. While path-based and distributional information
sources are considered complementary, the strong results the latter
showed on recent datasets suggested that the former’s contribution
might have become obsolete. We follow the recent success of an
integrated neural method for hypernymy detection (Shwartz et al.,
2016) and extend it to recognize multiple relations. We demonstrate
that these two information sources are indeed complementary, and
analyze the contributions of each source.


As with physical tools, the lack of a mental tool at a crucial moment
can lead to a bad result, and the use of a wrong mental tool is even
worse.

• [cs.CL]Personalization in Goal-Oriented Dialog
Chaitanya K. Joshi, Fei Mi, Boi Faltings
http://arxiv.org/abs/1706.07503v1

• [cs.CV]EgoCap: Egocentric Marker-less Motion Capture with Two
Fisheye Cameras

Helge Rhodin, Christian Richardt, Dan Casas, Eldar Insafutdinov,
Mohammad Shafiei, Hans-Peter Seidel, Bernt Schiele, Christian
Theobalt

http://arxiv.org/abs/1609.07306v1

• [cs.CV]An image compression and encryption scheme based on deep
learning

Fei Hu, Changjiu Pu, Haowei Gao, Mengzi Tang, Li Li
http://arxiv.org/abs/1608.05001v1


If this seems self-evident, it’s actually a very unnatural way to think.
Without the right training, most minds take the wrong approach. They
prefer to solve problems by asking: Which ideas do I already love and
know deeply, and how can I apply them to the situation at hand?
Psychologists call this tendency the “Availability Heuristic” and its
power is well documented.

The main goal of modelling human conversation is to create agents
which can interact with people in both open-ended and goal-oriented
scenarios. End-to-end trained neural dialog systems are an important
line of research for such generalized dialog models as they do not
resort to any situation-specific handcrafting of rules. Modelling
personalization of conversation in such agents is important for them
to be truly ‘smart’ and to integrate seamlessly into the lives of
human beings. However, the topic has been largely unexplored by
researchers as there are no existing corpora for training dialog
systems on conversations that are influenced by the profiles of the
speakers involved. In this paper, we present a new dataset of
goal-oriented dialogs with profiles attached to them. We also
introduce a framework for analyzing how systems model personalization.
Although no existing model was able to sufficiently solve our tasks,
we provide baselines using a variety of learning methods and
investigate in detail the shortcomings of an end-to-end dialog system
based on Memory Networks. Our dataset and experimental code are
publicly available at
https://github.com/chaitjo/personalized-dialog

Marker-based and marker-less optical skeletal motion-capture methods
use an outside-in arrangement of cameras placed around a scene, with
viewpoints converging on the center. They often cause discomfort
through the marker suits they may require, and their recording volume
is severely restricted, often constrained to indoor scenes with
controlled backgrounds. Alternative suit-based systems use several
inertial measurement units or an exoskeleton to capture motion. This
makes capturing independent of a confined volume, but requires
substantial, often constraining, and hard-to-set-up body
instrumentation. We
therefore propose a new method for real-time, marker-less and
egocentric motion capture which estimates the full-body skeleton pose
from a lightweight stereo pair of fisheye cameras that are attached to
a helmet or virtual reality headset. It combines the strength of a new
generative pose estimation framework for fisheye views with a
ConvNet-based body-part detector trained on a large new dataset. Our
inside-in method captures full-body motion in general indoor and
outdoor scenes, and also crowded scenes with many people in close
vicinity. The captured user can freely move around, which enables
reconstruction of larger-scale activities and is particularly useful
in virtual reality to freely roam and interact, while seeing the fully
motion-captured virtual body.

A Stacked Auto-Encoder (SAE) is a deep learning architecture for
unsupervised learning, with multiple layers that project the vector
representation of the input data into a lower-dimensional vector
space. These projection vectors are dense representations of the input
data, so an SAE can be used for image compression. Using a chaotic
logistic map, the compressed representations can further be encrypted.
In this study, an application of image compression and encryption
using an SAE and a chaotic logistic map is suggested. Experiments show
that this application is feasible and effective, and that it can be
used for simultaneous image transmission and image protection on the
internet.
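As a hedged sketch of the encryption half only: a chaotic logistic map can drive a byte keystream that is XORed with the compressed codes. The parameters and payload below are illustrative choices of ours, not the paper's scheme:

```python
def logistic_keystream(n, x0=0.37, r=3.99):
    """Byte keystream from iterating the chaotic logistic map x <- r*x*(1-x)."""
    x, out = x0, []
    for _ in range(n):
        x = r * x * (1 - x)
        out.append(int(x * 256) & 0xFF)
    return bytes(out)

def xor_crypt(payload, x0=0.37):
    """XOR the payload with the keystream; the same call encrypts and
    decrypts, and the initial condition x0 acts as the secret key."""
    stream = logistic_keystream(len(payload), x0)
    return bytes(a ^ b for a, b in zip(payload, stream))

compressed = bytes([12, 200, 3, 77])  # stand-in for SAE projection codes
encrypted = xor_crypt(compressed)
print(encrypted != compressed, xor_crypt(encrypted) == compressed)
```

The sensitivity of the map to `x0` is what makes the keystream hard to reproduce without the key, though a toy XOR cipher like this is not a substitute for vetted cryptography.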

You know the adage “To the man with only a hammer, everything starts
looking like a nail.” Such narrow-minded thinking feels entirely natural
to us, but it leads to far too many misjudgments. You probably do it
every single day without knowing it.

• [cs.CR]Integrating self-efficacy into a gamified approach to
thwart phishing attacks

Nalin Asanka Gamagedara Arachchilage, Mumtaz Abdul Hameed
http://arxiv.org/abs/1706.07748v1

• [cs.CV]Example-Based Image Synthesis via Randomized
Patch-Matching

Yi Ren, Yaniv Romano, Michael Elad
http://arxiv.org/abs/1609.07370v1

• [cs.CV]Frame- and Segment-Level Features and Candidate Pool
Evaluation for Video Caption Generation

Rakshith Shetty, Jorma Laaksonen
http://arxiv.org/abs/1608.04959v1


It’s not that you don’t have some good ideas in your head. You probably
do! No competent adult is a total klutz. It’s just that we tend to be
very limited in our good ideas, and we overuse them. This combination
makes our good ideas just as dangerous as bad ones!

Security exploits include cyber threats such as computer programs that
disturb the normal behavior of computer systems (viruses), unsolicited
e-mail (spam), malicious software (malware), monitoring software
(spyware), attempts to make computer resources unavailable to their
intended users (Distributed Denial-of-Service, or DDoS, attacks),
social engineering, and online identity theft (phishing). One such
cyber threat, particularly dangerous to computer users, is phishing:
online identity theft that aims to steal victims' sensitive
information. This paper focuses on designing an innovative and
gamified approach to educating individuals: how can self-efficacy,
which correlates with the user's knowledge, be integrated into an
anti-phishing educational game to thwart phishing attacks? One of the
main reasons users fall victim would appear to be a lack of knowledge
of how to prevent phishing attacks. Therefore, this research
investigates the elements that influence self-efficacy (in this case,
conceptual knowledge, procedural knowledge, or their interaction
effect) and integrates them into an anti-phishing educational game to
enhance people's phishing-prevention behaviour through motivation.

Image and texture synthesis is a challenging task that has long been
drawing attention in the fields of image processing, graphics, and
machine learning. This problem consists of modelling the desired type
of images, either through training examples or via parametric
modeling, and then generating images that belong to the same
distribution. Here we focus on two specific families of images:
handwritten digits and face images. This paper offers two main
contributions. First, we
suggest a simple and intuitive algorithm capable of generating such
images in a unified way. The proposed approach taken is pyramidal,
consisting of upscaling and refining the estimated image several
times. For each upscaling stage, the algorithm randomly draws small
patches from a patch database, and merges these to form a coherent and
novel image with high visual quality. The second contribution is a
general framework for the evaluation of the generation performance,
which combines three aspects: the likelihood, the originality and the
spread of the synthesized images. We assess the proposed synthesis
scheme and show that the results are similar in nature to, yet
different from, the ones found in the training set, suggesting that a
true synthesis effect has been obtained.

We present our submission to the Microsoft Video to Language Challenge
of generating short captions describing videos in the challenge
dataset. Our model is based on the encoder–decoder pipeline, popular
in image and video captioning systems. We propose to utilize two
different kinds of video features, one to capture the video content in
terms of objects and attributes, and the other to capture the motion
and action information. Using these diverse features we train models
specializing in two separate input sub-domains. We then train an
evaluator model which is used to pick the best caption from the pool
of candidates generated by these domain expert models. We argue that
this approach is better suited for the current video captioning task,
compared to using a single model, due to the diversity in the dataset.
The efficacy of our method is demonstrated by the fact that it was
rated best in the MSR Video to Language Challenge, as per human
evaluation. Additionally, we ranked second in the table based on
automatic evaluation metrics.
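The candidate-pool step reduces to scoring each expert model's caption and keeping the best. A minimal sketch with a stand-in evaluator; a real evaluator is a trained model, and the captions here are hypothetical:

```python
def best_caption(candidates, evaluator):
    """Pick the highest-scoring caption from the pool generated by the
    domain-expert captioning models."""
    return max(candidates, key=evaluator)

# Hypothetical candidate pool, one caption per expert model; the stand-in
# evaluator simply prefers longer captions.
pool = ["a man is running", "a person plays the guitar", "a dog barks"]
print(best_caption(pool, evaluator=len))
```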

The great investor and teacher Benjamin Graham explained it best:

• [cs.CV]Computer-aided implant design for the restoration of
cranial defects

Xiaojun Chen, Lu Xu, Xing Li, Jan Egger
http://arxiv.org/abs/1706.07649v1

• [cs.CV]Funnel-Structured Cascade for Multi-View Face Detection
with Alignment-Awareness

Shuzhe Wu, Meina Kan, Zhenliang He, Shiguang Shan, Xilin Chen
http://arxiv.org/abs/1609.07304v1

• [cs.CV]Geometry-aware Similarity Learning on SPD Manifolds for
Visual Recognition

Zhiwu Huang, Ruiping Wang, Xianqiu Li, Wenxian Liu, Shiguang Shan, Luc
Van Gool, Xilin Chen

http://arxiv.org/abs/1608.04914v1

You can get in way more trouble with a good idea than a bad idea,
because you forget that the good idea has limits.

Patient-specific cranial implants are important and necessary in the
surgery of cranial defect restoration. However, traditional methods of
manual design of cranial implants are complicated and time-consuming.
Our purpose is to develop a novel software named EasyCrania to design
the cranial implants conveniently and efficiently. The process can be
divided into five steps, which are mirroring model, clipping surface,
surface fitting, the generation of the initial implant and the
generation of the final implant. The main concept of our method is to
use the geometry information of the mirrored model as the base to
generate the final implant. Comparative studies demonstrated that
EasyCrania can significantly improve the efficiency of cranial implant
design, and the intra- and inter-rater reliability of the software was
stable, at 87.07+/-1.6% and 87.73+/-1.4% respectively.

Multi-view face detection in open environment is a challenging task
due to diverse variations of face appearances and shapes. Most
multi-view face detectors depend on multiple models and organize them
in parallel, pyramid or tree structures, which compromise between
accuracy and time-cost. Aiming at a more favorable multi-view face
detector, we propose a novel funnel-structured cascade (FuSt)
detection framework. In a coarse-to-fine flavor, our FuSt consists of,
from top to bottom, 1) multiple view-specific fast LAB cascade for
extremely quick face proposal, 2) multiple coarse MLP cascade for
further candidate window verification, and 3) a unified fine MLP
cascade with shape-indexed features for accurate face detection.
Compared with other structures, on the one hand, the proposed one uses
multiple computationally efficient distributed classifiers to propose
a small number of candidate windows but with a high recall of
multi-view faces. On the other hand, by using a unified MLP cascade to
examine proposals of all views in a centralized style, it provides a
favorable solution for multi-view face detection with high accuracy
and low time-cost. Besides, the FuSt detector is alignment-aware and
performs a coarse facial part prediction which is beneficial for
subsequent face alignment. Extensive experiments on two challenging
datasets, FDDB and AFW, demonstrate the effectiveness of our FuSt
detector in both accuracy and speed.
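The coarse-to-fine idea can be sketched generically: each stage filters the surviving candidate windows, so cheap high-recall stages run first and the expensive unified stage sees only a few proposals. The stage predicates below are stand-ins for the LAB and MLP stages, not the paper's classifiers:

```python
def cascade(windows, stages):
    """Coarse-to-fine cascade: each stage filters the surviving candidate
    windows, so cheap high-recall stages discard most negatives before
    the expensive unified stage runs."""
    for keep in stages:
        windows = [w for w in windows if keep(w)]
    return windows

# Toy 'windows' are confidence scores; each stage is a stand-in predicate
# for the fast LAB, coarse MLP, and fine MLP stages respectively.
stage1 = lambda w: w > 0.1
stage2 = lambda w: w > 0.5
stage3 = lambda w: w > 0.8
print(cascade([0.05, 0.3, 0.6, 0.9, 0.95], [stage1, stage2, stage3]))
```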

Symmetric Positive Definite (SPD) matrices have been widely used for
data representation in many visual recognition tasks. The success is
mainly attributed to learning discriminative SPD matrices that encode
the Riemannian geometry of the underlying SPD manifold. In
this paper, we propose a geometry-aware SPD similarity learning
(SPDSL) framework to learn discriminative SPD features by directly
pursuing manifold-manifold transformation matrix of column full-rank.
Specifically, by exploiting the Riemannian geometry of the manifold of
fixed-rank Positive Semidefinite (PSD) matrices, we present a new
solution to reduce optimizing over the space of column full-rank
transformation matrices to optimizing on the PSD manifold which has a
well-established Riemannian structure. Under this solution, we exploit
a new supervised SPD similarity learning technique to learn the
transformation by regressing the similarities of selected SPD data
pairs to their ground-truth similarities on the target SPD manifold.
To optimize the proposed objective function, we further derive an
algorithm on the PSD manifold. Evaluations on three visual recognition
tasks demonstrate the advantages of the proposed method over the
existing SPD-based discriminant learning methods.

Smart people like Charlie Munger realize that the antidote to this sort
of mental overreaching is to add more models to your mental palette — to
expand your repertoire of ideas, making them vivid and available in the
problem-solving process.

• [cs.CV]Joint Prediction of Depths, Normals and Surface Curvature
from RGB Images using CNNs

Thanuja Dharmasiri, Andrew Spek, Tom Drummond
http://arxiv.org/abs/1706.07593v1

• [cs.CV]Real-time Human Pose Estimation from Video with
Convolutional Neural Networks

Marko Linna, Juho Kannala, Esa Rahtu
http://arxiv.org/abs/1609.07420v1

• [cs.CV]Globally Variance-Constrained Sparse Representation for
Image Set Compression

Xiang Zhang, Jiarui Sun, Siwei Ma, Zhouchen Lin, Jian Zhang, Shiqi
Wang, Wen Gao

http://arxiv.org/abs/1608.04902v1

You’ll know you’re on to something when ideas start to compete with one
another — you’ll find situations where Model 1 tells you X and Model 2
tells you Y. Believe it or not, this is a sign that you’re on the right
track. Letting the models compete and fight for superiority and greater
fundamental truth is what good thinking is all about! It’s hard work,
but that’s the only way to get the right answers.

Understanding the 3D structure of a scene is of vital importance, when
it comes to developing fully autonomous robots. To this end, we
present a novel deep learning based framework that estimates depth,
surface normals and surface curvature by only using a single RGB
image. To the best of our knowledge this is the first work to estimate
surface curvature from colour using a machine learning approach.
Additionally, we demonstrate that by tuning the network to infer
well-designed features, such as surface curvature, we can achieve
improved performance at estimating depth and normals. This indicates
that network guidance is still a useful aspect of designing and
training a neural network. We run extensive experiments where the
network is
trained to infer different tasks while the model capacity is kept
constant resulting in different feature maps based on the tasks at
hand. We outperform the previous state-of-the-art benchmarks which
jointly estimate depths and surface normals while predicting surface
curvature in parallel.

In this paper, we present a method for real-time multi-person human
pose estimation from video by utilizing convolutional neural networks.
Our method is aimed at use-case-specific applications, where good
accuracy is essential and variation of the background and poses is
limited. This enables us to use a generic network architecture, which
is both accurate and fast. We divide the problem into two phases: (1)
pre-training and (2) finetuning. In pre-training, the network is
learned with highly diverse input data from publicly available
datasets, while in finetuning we train with application specific data,
which we record with Kinect. Our method differs from most of the
state-of-the-art methods in that we consider the whole system,
including person detector, pose estimator and an automatic way to
record application specific training material for finetuning. Our
method is considerably faster than many of the state-of-the-art
methods. Our method can be thought of as a replacement for Kinect, and
it can be used for higher level tasks, such as gesture control, games,
person tracking, action recognition and action tracking. We achieved
accuracy of 96.8% (PCK@0.2) with application specific data.

Sparse representation presents an efficient approach to approximately
recover a signal by the linear composition of a few bases from a
learnt dictionary, based on which various successful applications have
been observed. However, in the scenario of data compression, its
efficiency and popularity are hindered due to the extra overhead for
encoding the sparse coefficients. Therefore, how to establish an
accurate rate model in sparse coding and dictionary learning becomes
meaningful, which has not been fully exploited in the context of
sparse representation. According to the Shannon entropy inequality,
the variance of data source bounds its entropy, which can reflect the
actual coding bits. Hence, in this work a Globally
Variance-Constrained Sparse Representation (GVCSR) model is proposed,
where a variance-constrained rate model is introduced in the
optimization process. Specifically, we employ the Alternating
Direction Method of Multipliers (ADMM) to solve the non-convex
optimization problem for sparse coding and dictionary learning, both
of which have shown state-of-the-art performance in image
representation. Furthermore, we investigate the potential of GVCSR in
practical image set compression, where a common dictionary is trained
by several key images to represent the whole image set. Experimental
results have demonstrated significant performance improvements against
the most popular image codecs including JPEG and JPEG2000.
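The paper solves its model with ADMM; as a simpler stand-in for the sparse-coding inner step, here is the l1 soft-thresholding (shrinkage) update that such solvers build on. The dictionary and numbers are toy choices of ours, not GVCSR itself:

```python
import numpy as np

def soft_threshold(v, lam):
    """Proximal operator of the l1 norm: shrink each coefficient toward 0."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def ista_step(D, x, a, lam, step):
    """One shrinkage step for min_a 0.5*||x - D a||^2 + lam*||a||_1."""
    grad = D.T @ (D @ a - x)
    return soft_threshold(a - step * grad, step * lam)

D = np.eye(2)              # trivial toy dictionary
x = np.array([1.0, 0.05])  # signal to encode
a = np.zeros(2)
for _ in range(50):
    a = ista_step(D, x, a, lam=0.1, step=1.0)
print(a)  # the large coefficient survives; the tiny one is shrunk to zero
```

GVCSR additionally constrains the variance of the coefficients to bound the coding rate, which changes the objective but not the shrinkage machinery sketched here.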

It’s a little like learning to walk or ride a bike; at first, you can’t
believe how much you’re supposed to do all at once, but eventually, you
wonder how you ever didn’t know how to do it.

• [cs.CV]Listen to Your Face: Inferring Facial Action Units from
Audio Channel

Zibo Meng, Shizhong Han, Yan Tong
http://arxiv.org/abs/1706.07536v1

• [cs.CV]The face-space duality hypothesis: a computational
model

Jonathan Vitale, Mary-Anne Williams, Benjamin Johnston
http://arxiv.org/abs/1609.07371v1

• [cs.CV]Large Angle based Skeleton Extraction for 3D Animation
Hugo Martin, Raphael Fernandez, Yong Khoo
http://arxiv.org/abs/1608.05045v1

As Charlie Munger likes to say, going back to any other method of
thinking would feel like cutting off your hands. Our experience confirms
the truth of Munger’s dictum.

Extensive efforts have been devoted to recognizing facial action units
(AUs). However, it is still challenging to recognize AUs from
spontaneous facial displays especially when they are accompanied with
speech. Different from all prior work that utilized visual
observations for facial AU recognition, this paper presents a novel
approach that recognizes speech-related AUs exclusively from audio
signals based on the fact that facial activities are highly correlated
with voice during speech. Specifically, dynamic and physiological
relationships between AUs and phonemes are modeled through a
continuous time Bayesian network (CTBN); then AU recognition is
performed by probabilistic inference via the CTBN model. A pilot
audiovisual AU-coded database has been constructed to evaluate the
proposed audio-based AU recognition framework. The database consists
of a “clean” subset with frontal and neutral faces and a challenging
subset collected with large head movements and occlusions.
Experimental results on this database show that the proposed CTBN
model achieves promising recognition performance for 7 speech-related
AUs and outperforms the state-of-the-art visual-based methods
especially for those AUs that are activated at low intensities or
“hardly visible” in the visual channel. Furthermore, the CTBN model
yields more impressive recognition performance on the challenging
subset, where the visual-based approaches suffer significantly.

Valentine’s face-space suggests that faces are represented in a
psychological multidimensional space according to their perceived
properties. However, the proposed framework was initially designed as
an account of invariant facial features only, and explanations for
dynamic features representation were neglected. In this paper we
propose, develop and evaluate a computational model for a twofold
structure of the face-space, able to unify both identity and
expression representations in a single implemented model. To capture
both invariant and dynamic facial features we introduce the face-space
duality hypothesis and subsequently validate it through a mathematical
presentation using a general approach to dimensionality reduction. Two
experiments with real facial images show that the proposed face-space:
(1) supports both identity and expression recognition, and (2) has a
twofold structure anticipated by our formal argument.

In this paper, we present a solution for arbitrary 3D character
deformation by investigating the rotation angle of decomposition and
preserving the mesh topology structure. In computer graphics, skeleton
extraction and skeleton-driven animation are active areas that attract
increasing interest from researchers. Accuracy is critical for
realistic animation and related applications. There have been
extensive studies on skeleton-based 3D deformation. However, the
scenario of large-angle rotation of different body parts has been
relatively less addressed by the state of the art, which often yields
unsatisfactory results. Beyond 3D animation problems, we also note
that for many 3D skeleton detection or tracking applications from
video or depth streams, large-angle rotation is a critical factor in
regression accuracy and robustness. We introduce a distortion
metric function to quantify the surface curviness before and after
deformation, which is a major clue for large-angle rotation detection.
Extensive experimental results show that our method is suitable
for 3D modeling, animation, and skeleton-based tracking applications.
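
The distortion metric itself is not specified in this summary; as a hedged sketch (function names are our own, and a polyline stands in for a mesh surface), "curviness" can be measured as the sum of turning angles, and the distortion of a deformation as the change in that quantity:

```python
import numpy as np

def curviness(points):
    """Sum of turning angles (radians) along a polyline of shape (n, 2 or 3)."""
    edges = np.diff(points, axis=0)
    total = 0.0
    for a, b in zip(edges[:-1], edges[1:]):
        cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
        total += np.arccos(np.clip(cos, -1.0, 1.0))
    return total

def distortion(before, after):
    """Absolute change in curviness caused by a deformation."""
    return abs(curviness(after) - curviness(before))

straight = np.array([[0, 0], [1, 0], [2, 0], [3, 0]], float)
bent = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], float)  # two 90-degree turns
print(distortion(straight, bent))  # ~3.1416: two right-angle turns added
```

A large distortion value would then flag a large-angle rotation for closer inspection.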

• [cs.CV]Sampling Matters in Deep Embedding Learning
Chao-Yuan Wu, R. Manmatha, Alexander J. Smola, Philipp Krähenbühl
http://arxiv.org/abs/1706.07567v1

• [cs.CY]On the (im)possibility of fairness
Sorelle A. Friedler, Carlos Scheidegger, Suresh Venkatasubramanian
http://arxiv.org/abs/1609.07236v1

• [cs.CY]Modelling Student Behavior using Granular Large Scale
Action Data from a MOOC

Steven Tang, Joshua C. Peterson, Zachary A. Pardos
http://arxiv.org/abs/1608.04789v1


What kinds of knowledge are we talking about adding to our repertoire?

Deep embeddings answer one simple question: How similar are two
images? Learning these embeddings is the bedrock of verification,
zero-shot learning, and visual search. The most prominent approaches
optimize a deep convolutional network with a suitable loss function,
such as contrastive loss or triplet loss. While a rich line of work
focuses solely on the loss functions, we show in this paper that
selecting training examples plays an equally important role. We show
that a simple margin-based loss is sufficient to outperform all other loss
functions. We evaluate our approach on the Stanford Online Products,
CAR196, and the CUB200-2011 datasets for image retrieval and
clustering, and on the LFW dataset for face verification. Our method
achieves state-of-the-art performance on all of them.
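
As an illustration of the loss family the abstract contrasts, here is a minimal sketch of a margin-based pairwise loss in the spirit of the paper; the exact formulation and the `alpha`/`beta` values are assumptions for illustration, not taken from the paper:

```python
import numpy as np

def margin_loss(d, same, alpha=0.2, beta=1.0):
    """Margin-based loss on pairwise embedding distances d.

    same=+1 for positive pairs, -1 for negatives: the hinge pulls
    positives below beta - alpha and pushes negatives above beta + alpha.
    """
    return np.maximum(0.0, alpha + same * (d - beta))

# A close positive pair and a far negative pair incur no loss:
print(margin_loss(np.array([0.5, 1.8]), np.array([+1, -1])))  # [0. 0.]
# Violations are penalized linearly (roughly [0.7, 0.3] here):
print(margin_loss(np.array([1.5, 0.9]), np.array([+1, -1])))
```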

What does it mean for an algorithm to be fair? Different papers use
different notions of algorithmic fairness, and although these appear
internally consistent, they also seem mutually incompatible. We
present a mathematical setting in which the distinctions in previous
papers can be made formal. In addition to characterizing the spaces of
inputs (the “observed” space) and outputs (the “decision” space), we
introduce the notion of a construct space: a space that captures
unobservable, but meaningful variables for the prediction. We show
that in order to prove desirable properties of the entire
decision-making process, different mechanisms for fairness require
different assumptions about the nature of the mapping from construct
space to decision space. The results in this paper imply that future
treatments of algorithmic fairness should more explicitly state
assumptions about the relationship between constructs and
observations.

Digital learning environments generate a precise record of the actions
learners take as they interact with learning materials and complete
exercises towards comprehension. With this high quantity of sequential
data comes the potential to apply time series models to learn about
underlying behavioral patterns and trends that characterize successful
learning based on the granular record of student actions. There exist
several methods for looking at longitudinal, sequential data like
those recorded from learning environments. In the field of language
modelling, traditional n-gram techniques and modern recurrent neural
network (RNN) approaches have been applied to algorithmically find
structure in language and predict the next word given the previous
words in the sentence or paragraph as input. In this paper, we draw an
analogy to this work by treating student sequences of resource views
and interactions in a MOOC as the inputs and predicting students’ next
interaction as outputs. In this study, we train only on students who
received a certificate of completion. In doing so, the model could
potentially be used for recommendation of sequences eventually leading
to success, as opposed to perpetuating unproductive behavior. Given
that the MOOC used in our study had over 3,500 unique resources,
predicting the exact resource that a student will interact with next
might appear to be a difficult classification problem. We find that
simply following the syllabus (built-in structure of the course) gives
on average 23% accuracy in making this prediction, followed by the
n-gram method with 70.4%, and RNN based methods with 72.2%. This
research lays the ground work for recommendation in a MOOC and other
digital learning environments where high volumes of sequential data
exist.
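
The n-gram baseline described above can be sketched as a simple bigram predictor over resource IDs; the toy sequences and resource names here are hypothetical:

```python
from collections import Counter, defaultdict

def train_bigram(sequences):
    """Count resource-to-resource transitions across student sequences."""
    nxt = defaultdict(Counter)
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            nxt[a][b] += 1
    return nxt

def predict_next(nxt, resource):
    """Most frequent successor of a resource, or None if unseen."""
    return nxt[resource].most_common(1)[0][0] if nxt[resource] else None

# Toy sequences of resource views from certificate-earning students:
seqs = [["intro", "quiz1", "video2"], ["intro", "quiz1", "forum"],
        ["intro", "quiz1", "video2"]]
model = train_bigram(seqs)
print(predict_next(model, "quiz1"))  # "video2" (seen twice vs once)
```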

It’s the big, basic ideas of all the truly fundamental academic
disciplines. The stuff you should have learned in the “101” course of
each major subject but probably didn’t. These are the true general
principles that underlie most of what’s going on in the world.

• [cs.CV]Training Adversarial Discriminators for Cross-channel
Abnormal Event Detection in Crowds

Mahdyar Ravanbakhsh, Enver Sangineto, Moin Nabi, Nicu Sebe
http://arxiv.org/abs/1706.07680v1

• [cs.CY]Tracking the Trackers: Towards Understanding the Mobile

Narseo Vallina-Rodriguez, Srikanth Sundaresan, Abbas Razaghpanah,
Rishab Nithyanand, Mark Allman, Christian Kreibich, Phillipa Gill

http://arxiv.org/abs/1609.07190v1

• [cs.DC]Safe Serializable Secure Scheduling: Transactions and the

Isaac Sheff, Tom Magrino, Jed Liu, Andrew C. Myers, Robbert van
Renesse

http://arxiv.org/abs/1608.04841v1

Things like: The main laws of physics. The main ideas driving chemistry.
The big, useful tools of mathematics. The guiding principles of biology.
The hugely useful concepts from human psychology. The central principles
of systems thinking. The working concepts behind business and markets.

Abnormal crowd behaviour detection attracts a large interest due to
its importance in video surveillance scenarios. However, the ambiguity
and the lack of sufficient “abnormal” ground truth data makes
end-to-end training of large deep networks hard in this domain. In
this paper we propose to use Generative Adversarial Nets (GANs), which
are trained to generate only the “normal” distribution of the data.
During the adversarial GAN training, a discriminator “D” is used as a
supervisor for the generator network “G” and vice versa. At testing
time we use “D” to solve our discriminative task (abnormality
detection), where “D” has been trained without the need of
manually-annotated abnormal data. Moreover, in order to prevent “G”
from learning a trivial identity function, we use a cross-channel
approach, forcing “G” to transform raw-pixel data into motion information and vice
versa. The quantitative results on standard benchmarks show that our
method outperforms previous state-of-the-art methods in both the
frame-level and the pixel-level evaluation.

Third-party services form an integral part of the mobile ecosystem:
they allow app developers to add features such as performance
analytics and social network integration, and to monetize their apps
by enabling user tracking and targeted ad delivery. At present users,
researchers, and regulators all have at best limited understanding of
this third-party ecosystem. In this paper we seek to shrink this gap.
Using data from users of our ICSI Haystack app we gain a rich view of
the mobile ecosystem: we identify and characterize domains associated
with mobile advertising and user tracking, thereby taking an important
step towards greater transparency. We furthermore outline our steps
towards a public catalog and census of analytics services, their
behavior, their personal data collection processes, and their use
across mobile apps.

Modern applications often operate on data in multiple administrative
domains. In this federated setting, participants may not fully trust
each other. These distributed applications use transactions as a core
mechanism for ensuring reliability and consistency with persistent
data. However, the coordination mechanisms needed for transactions can
both leak confidential information and allow unauthorized influence.
By implementing a simple attack, we show these side channels can be
exploited. However, our focus is on preventing such attacks. We
explore secure scheduling of atomic, serializable transactions in a
federated setting. While we prove that no protocol can guarantee
security and liveness in all settings, we establish conditions for
sets of transactions that can safely complete under secure scheduling.
Based on these conditions, we introduce staged commit, a secure
scheduling protocol for federated transactions. This protocol avoids
insecure information channels by dividing transactions into distinct
stages. We implement a compiler that statically checks code to ensure
it meets our conditions, and a system that schedules these
transactions using the staged commit protocol. Experiments on this
implementation demonstrate that realistic federated transactions can
be scheduled securely, atomically, and efficiently.

These are the winning ideas. For all of the “bestselling” crap that is
touted as the new thing each year, there is almost certainly a bigger,
more fundamental, and more broadly applicable underlying idea that we
already knew about! The “new idea” is thus an application of old ideas,
packaged into a new format.

• [cs.CY]Computational Controversy
Benjamin Timmermans, Tobias Kuhn, Kaspar Beelen, Lora Aroyo
http://arxiv.org/abs/1706.07643v1

• [cs.DC]MPI Parallelization of the Resistive Wall Code STARWALL:
Report of the EUROfusion High Level Support Team Project JORSTAR

S. Mochalskyy, M. Hoelzl, R. Hatzky
http://arxiv.org/abs/1609.07441v1

• [cs.DC]The BioDynaMo Project: Creating a Platform for Large-Scale
Reproducible Biological Simulations

Lukas Breitwieser, Roman Bauer, Alberto Di Meglio, Leonard Johard,
Marcus Kaiser, Marco Manca, Manuel Mazzara, Fons Rademakers, Max
Talanov

http://arxiv.org/abs/1608.04967v1

Yet we tend to spend the majority of time keeping up with the “new” at
the expense of learning the “old”! This is truly nuts.

Climate change, vaccination, abortion, Trump: Many topics are
surrounded by fierce controversies. The nature of such heated debates
and their elements have been studied extensively in the social science
literature. More recently, various computational approaches to
controversy analysis have appeared, using new data sources such as
Wikipedia, which now help us better understand these phenomena.
However, compared to what social sciences have discovered about such
debates, the existing computational approaches mostly focus on just a
few of the many important aspects around the concept of controversies.
In order to link the two strands, we provide and evaluate here a
controversy model that is both rooted in the findings of the social
science literature and strongly linked to
computational methods. We show how this model can lead to
computational controversy analytics that have full coverage over all
the crucial aspects that make up a controversy.

Large scale plasma instabilities inside a tokamak can be influenced by
the currents flowing in the conducting vessel wall. This involves
nonlinear plasma dynamics and its interaction with the wall current. In
order to study this problem the code that solves the
magneto-hydrodynamic (MHD) equations, called JOREK, was coupled with
the model for the vacuum region and the resistive conducting structure
named STARWALL. The JOREK-STARWALL model has already been applied to
perform simulations of the Vertical Displacement Events (VDEs), the
Resistive Wall Modes (RWMs), and Quiescent H-Mode. At the beginning of
the project it was not possible to resolve the realistic wall
structure with a large number of finite element triangles due to the
huge consumption of memory and wall clock time by STARWALL and the
corresponding coupling routine in JOREK. Moreover, both the STARWALL
code and the JOREK coupling routine are only partially parallelized
via OpenMP. The aim of this project is to implement an MPI
parallelization in the model that should make it possible to obtain
realistic results with high resolution. This project concentrates on the MPI
parallelization of STARWALL. Parallel I/O and the MPI parallelization
of the coupling terms inside JOREK will be addressed in a follow-up
project.

Computer simulations have become a very powerful tool for scientific
research. In order to facilitate research in computational biology,
the BioDynaMo project aims at a general platform for biological
computer simulations, which should be executable on hybrid cloud
computing systems. This paper describes challenges and lessons learnt
during the early stages of the software development process, in the
context of implementation issues and the international nature of the
collaboration.

The mental-models approach inverts the process to the way it should be:
learning the Big Stuff deeply and then using that powerful database
every single day.

• [cs.CY]Human decisions in moral dilemmas are largely described by
Utilitarianism: virtual car driving study provides guidelines for

Maximilian Alexander Wächter, Anja Faulhaber, Felix Blind, Silja Timm,
Anke Dittmer, Leon René Sütfeld, Achim Stephan, Gordon Pipa, Peter
König

http://arxiv.org/abs/1706.07332v2

• [cs.DL]OCR++: A Robust Framework For Information Extraction from
Scholarly Articles

Mayank Singh, Barnopriyo Barua, Priyank Palod, Manvi Garg, Sidhartha
Satapathy, Samuel Bushi, Kumar Ayush, Krishna Sai Rohith, Tulasi Gamidi,
Pawan Goyal, Animesh Mukherjee

http://arxiv.org/abs/1609.06423v3

• [cs.DL]Anomalies in the peer-review system: A case study of the
journal of High Energy Physics

Sandipan Sikdar, Matteo Marsili, Niloy Ganguly, Animesh Mukherjee
http://arxiv.org/abs/1608.04875v1

The overarching goal is to build a powerful “tree” of the mind with
strong and deep roots, a massive trunk, and lots of sturdy branches. We
use this tree to hang the “leaves” of experience we acquire, directly
and vicariously, throughout our lifetimes: the scenarios, decisions,
problems, and solutions arising in any human life.

Ethical thought experiments such as the trolley dilemma have been
investigated extensively in the past, showing that humans act in a
utilitarian way, trying to cause as little overall damage as possible.
These trolley dilemmas have gained renewed attention over the past
years; especially due to the necessity of implementing moral decisions
in autonomous driving vehicles. We conducted a set of experiments in
which participants experienced modified trolley dilemmas as the driver
in a virtual reality environment. Participants had to make
decisions between two discrete options: driving on one of two lanes
where different obstacles came into view. Obstacles included a variety
of human-like avatars of different ages and group sizes. Furthermore,
we tested the influence of a sidewalk as a potential safe harbor and a
condition implicating a self-sacrifice. Results showed that subjects,
in general, decided in a utilitarian manner, sparing the highest
number of avatars possible with a limited influence of the other
variables. Our findings support that human behavior is in line with
the utilitarian approach to moral decision making. This may serve as a
guideline for the implementation of moral decisions in autonomous
driving vehicles (ADVs).

This paper proposes OCR++, an open-source framework designed for a
variety of information extraction tasks from scholarly articles
including metadata (title, author names, affiliation and e-mail),
URLs and footnotes, and bibliography (citation instances and
references). We analyze a diverse set of scientific articles written
in English to understand generic writing patterns and
formulate rules to develop this hybrid framework. Extensive
evaluations show that the proposed framework outperforms the existing
state-of-the-art tools by a huge margin in structural information
extraction along with improved performance in metadata and
bibliography extraction tasks, both in terms of accuracy (around 50%
improvement) and processing time (around 52% improvement). A user
experience study conducted with the help of 30 researchers reveals
that the researchers found this system to be very helpful. As an
additional objective, we discuss two novel use cases including
automatically extracting links to public datasets from the
proceedings, which would further accelerate the advancement in digital
libraries. The result of the framework can be exported as a whole into
structured TEI-encoded documents. Our framework is accessible online
at
http://cnergres.iitkgp.ac.in/OCR++/home/.

The peer-review system has long been relied upon for bringing quality
research to the notice of the scientific community and for preventing
flawed research from entering the literature. The need for the
peer-review system has often been debated, as in numerous cases it has
failed in its task, and in most of these cases the editors and
reviewers were thought to be responsible for not being able to
correctly judge the quality of the work. This raises the question “Can
the peer-review system be improved?” Since editors and reviewers are
the most important pillars of a reviewing system, we in this work
attempt to address a related question – given the editing/reviewing
history of the editors or reviewers, “can we identify the
under-performing ones?”, with citations received by the
edited/reviewed papers being used as a proxy for quantifying
performance. We term such reviewers and editors as anomalous, and we
believe identifying and removing them shall improve the performance of
the peer-review system. Using a massive dataset of the Journal of High
Energy Physics (JHEP) consisting of 29k papers submitted between 1997
and 2015, with 95 editors and 4035 reviewers and their review history,
we identify several factors which point to anomalous behavior of
referees and editors. In fact, the anomalous editors and reviewers
account for 26.8% and 14.5% of the total editors and reviewers
respectively, and for most of these anomalous reviewers the performance

Now, let’s start by summarizing the models we’ve found useful. To
explore them in more depth, click the links provided below.

• [cs.CY]Mediated behavioural change in human-machine networks:
exploring network characteristics, trust and motivation

Paul Walland, J. Brian Pickering
http://arxiv.org/abs/1706.07597v1

• [cs.DS]Scheduling Under Power and Energy Constraints
Mohammed Haroon Dupty, Pragati Agrawal, Shrisha Rao
http://arxiv.org/abs/1609.07354v1

• [cs.DS]Faster Sublinear Algorithms using Conditional Sampling
Themistoklis Gouleakis, Christos Tzamos, Manolis Zampetakis
http://arxiv.org/abs/1608.04759v1

And remember: Building your latticework is a lifelong project. Stick
with it, and you’ll find that your ability to understand reality, make
consistently good decisions, and help those you love will always be
improving.

Human-machine networks pervade much of contemporary life. Network
change is the product of structural modifications along with
differences in participant behavior. If we assume that behavioural
change in a human-machine network is the result of changing the
attitudes of participants in the network, then the question arises
whether network structure can affect participant attitude. Taking
citizen participation as an example, engagement with relevant
stakeholders reveals trust and motivation to be the major objectives
for the network. Using a typology to describe network state based on
multiple characteristics or dimensions, we can predict possible
behavioural outcomes in the network. However, this has to be mediated
via attitude change. Motivation for the citizen participation network
can only increase in line with enhanced trust. The focus for changing
network dynamics, therefore, shifts to the dimensional changes needed
to encourage increased trust. It turns out that the coordinated
manipulation of multiple dimensions is needed to bring about the
desired shift in attitude.

Given a system model where machines have distinct speeds and power
ratings but are otherwise compatible, we consider various problems of
scheduling under resource constraints on the system which place the
restriction that not all machines can be run at once. These can be
power, energy, or makespan constraints on the system. Given such
constraints, there are problems with divisible as well as
non-divisible jobs. In the setting where there is a constraint on
power, we show that the problem of minimizing makespan for a set of
divisible jobs is NP-hard by reduction to the knapsack problem. We
then show that scheduling to minimize energy with power constraints is
also NP-hard. We then consider scheduling with energy and makespan
constraints with divisible jobs and show that these can be solved in
polynomial time, and the problems with non-divisible jobs are NP-hard.
We give exact and approximation algorithms for these problems as
required.
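
The knapsack connection for divisible jobs can be illustrated directly: with a divisible workload, makespan is total work divided by the total speed of the powered-on machines, so the power-constrained problem amounts to picking machines that maximize total speed under a power budget. A sketch with hypothetical machine data (this mirrors the flavor of the reduction, not the paper's exact construction):

```python
def best_speed(machines, power_budget):
    """0/1 knapsack DP: pick machines (speed, power) maximizing total
    speed under an integer power budget. With divisible jobs the
    makespan is total_work / total_speed, so maximizing speed
    minimizes makespan."""
    best = [0] * (power_budget + 1)
    for speed, power in machines:
        for b in range(power_budget, power - 1, -1):
            best[b] = max(best[b], best[b - power] + speed)
    return best[power_budget]

machines = [(10, 5), (7, 3), (6, 3)]   # hypothetical (speed, power rating)
speed = best_speed(machines, 6)
print(speed, 100 / speed)  # total speed 13, makespan for 100 units of work
```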

A conditional sampling oracle for a probability distribution D returns
samples from the conditional distribution of D restricted to a
specified subset of the domain. A recent line of work (Chakraborty et
al. 2013 and Canonne et al. 2014) has shown that having access to such
a conditional sampling oracle requires only polylogarithmic or even
constant number of samples to solve distribution testing problems like
identity and uniformity. This significantly improves over the standard
sampling model where polynomially many samples are necessary. Inspired
by these results, we introduce a computational model based on
conditional sampling to develop sublinear algorithms with
exponentially faster runtimes compared to standard sublinear
algorithms. We focus on geometric optimization problems over points in
a discrete domain, accessed via a conditional sampling oracle that takes as input a succinct
representation of a subset of the domain and outputs a uniformly
random point in that subset. We study two well studied problems:
k-means clustering and estimating the weight of the minimum spanning
tree. In contrast to prior algorithms for the classic model, our
algorithms have time, space and sample complexity that is polynomial
in the dimension and polylogarithmic in the number of points. Finally,
we comment on the applicability of the model and compare with existing
ones like streaming, parallel and distributed computational models.
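
A toy version of the conditional sampling oracle interface might look as follows. Note that this sketch enumerates the subset explicitly, so it is not itself sublinear; it only illustrates the interface the algorithms assume (the names are ours):

```python
import random

def conditional_sample(points, predicate, rng=random):
    """Toy conditional sampling oracle: a uniformly random point from
    the subset of the domain satisfying `predicate`. A real oracle
    would take a succinct subset representation instead of scanning."""
    subset = [p for p in points if predicate(p)]
    return rng.choice(subset) if subset else None

# Domain: grid points; condition: membership in a query rectangle.
points = [(x, y) for x in range(10) for y in range(10)]
in_box = lambda p: 2 <= p[0] <= 4 and 2 <= p[1] <= 4
sample = conditional_sample(points, in_box)
print(sample)  # some uniformly random point inside the 3x3 box
```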

The Farnam Street Latticework of Mental Models

• [cs.DC]Heterogeneous MPSoCs for Mixed Criticality Systems:
Challenges and Opportunities

Mohamed Hassan
http://arxiv.org/abs/1706.07429v1

• [cs.LG]A Novel Progressive Multi-label Classifier for
Classincremental Data

Mihika Dave, Sahil Tapiawala, Meng Joo Er, Rajasekar Venkatesan
http://arxiv.org/abs/1609.07215v1

• [cs.DS]Lecture Notes on Spectral Graph Methods
Michael W. Mahoney
http://arxiv.org/abs/1608.04845v1

Mental Models — How to Solve Problems

Due to their cost, performance, area, and energy efficiency, MPSoCs
offer an appealing architecture for emerging mixed criticality systems
(MCS) such as driverless cars, smart power grids, and healthcare
devices. Furthermore, heterogeneity of MPSoCs presents exceptional
opportunities to satisfy the conflicting requirements of MCS. Seizing
these opportunities is unattainable without addressing the associated
challenges. We focus on four aspects of MCS that we believe are of
most importance upon adopting MPSoCs: theoretical model, interference,
data sharing, and security. We outline existing solutions and
highlight the necessary considerations for MPSoCs, including both the
opportunities they create and research directions yet to be explored.

In this paper, a progressive learning algorithm for multi-label
classification to learn new labels while retaining the knowledge of
previous labels is designed. New output neurons corresponding to new
labels are added and the neural network connections and parameters are
automatically restructured as if the label has been introduced from
the beginning. This work is the first of its kind: a multi-label
classifier for class-incremental learning. It is useful for real-world
applications such as robotics where streaming data are available and
the number of labels is often unknown. Based on the Extreme Learning
Machine framework, a novel universal classifier with plug and play
capabilities for progressive multi-label classification is developed.
Experimental results on various benchmark synthetic and real datasets
validate the efficiency and effectiveness of our proposed algorithm.
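
The underlying Extreme Learning Machine mechanism makes progressive label addition cheap: the random hidden layer is fixed, and each output column is an independent least-squares solve, so a new label only appends a column. A hedged numpy sketch of that general idea (not the paper's exact restructuring scheme):

```python
import numpy as np

rng = np.random.default_rng(0)

def elm_hidden(X, W, b):
    """Fixed random hidden layer of an Extreme Learning Machine."""
    return np.tanh(X @ W + b)

X = rng.normal(size=(100, 4))                  # training inputs
W, b = rng.normal(size=(4, 20)), rng.normal(size=20)
H = elm_hidden(X, W, b)

T_old = (X[:, :1] > 0).astype(float)           # one existing label
B = np.linalg.pinv(H) @ T_old                  # output weights, (20, 1)

# A new label arrives: append one output column. The hidden layer is
# untouched and the existing column of B is exactly preserved.
t_new = (X[:, 1:2] > 0).astype(float)
B = np.hstack([B, np.linalg.pinv(H) @ t_new])
print(B.shape)  # (20, 2)
```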

These are lecture notes that are based on the lectures from a class I
taught on the topic of Spectral Graph Methods at UC Berkeley during
the Spring 2015 semester.

General Thinking Concepts (11)

• [cs.DC]Interoperable Convergence of Storage, Networking and
Computation

Micah Beck, Terry Moore, Piotr Luszczek
http://arxiv.org/abs/1706.07519v1

• [cs.LG]Multilayer Spectral Graph Clustering via Convex Layer
Aggregation

Pin-Yu Chen, Alfred O. Hero III
http://arxiv.org/abs/1609.07200v1

• [cs.IT]Hard Clusters Maximize Mutual Information
Bernhard C. Geiger, Rana Ali Amjad
http://arxiv.org/abs/1608.04872v1

1. Inversion

In every form of digital store-and-forward communication, intermediate
forwarding nodes are computers, with attendant memory and processing
resources. This has inevitably given rise to efforts to create a wide
area infrastructure that goes beyond simple store and forward, a
facility that makes more general and varied use of the potential of
this collection of increasingly powerful nodes. Historically, efforts
in this direction predate the advent of globally routed packet
networking. The desire for a converged infrastructure of this kind has
only intensified over the last 30 years, as memory, storage and
processing resources have both increased in density and speed and
decreased in cost. Although there seems to be a general consensus that
it should be possible to define and deploy such a dramatically more
capable wide area facility, a great deal of investment in research
prototypes has yet to produce a credible candidate architecture.
Drawing on technical analysis, historical examples, and case studies,
we present a argument for the hypothesis that in order to realize a
distributed system with the kind of convergent generality and
deployment scalability that might qualify as “future-defining,” we
must build it up from a small set of simple, generic, and limited
abstractions of the low level processing, storage and network
resources of its intermediate nodes.

Multilayer graphs are commonly used for representing different
relations between entities and handling heterogeneous data processing
tasks. New challenges arise in multilayer graph clustering for
assigning clusters to a common multilayer node set and for combining
information from each layer. This paper presents a theoretical
framework for multilayer spectral graph clustering of the nodes via
convex layer aggregation. Under a novel multilayer signal plus noise
model, we provide a phase transition analysis that establishes the
existence of a critical value on the noise level that permits reliable
cluster separation. The analysis also specifies analytical upper and
lower bounds on the critical value, where the bounds become exact when
the clusters have identical sizes. Numerical experiments on synthetic
multilayer graphs are conducted to validate the phase transition
analysis and study the effect of layer weights and noise levels on
clustering reliability.
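
Convex layer aggregation followed by spectral clustering can be sketched for the two-cluster case: combine the layer adjacency matrices with convex weights, then split nodes by the sign of the Fiedler vector. The planted two-block example below is our own illustration, not the paper's experiment:

```python
import numpy as np

def aggregate(layers, weights):
    """Convex combination of per-layer adjacency matrices."""
    assert abs(sum(weights) - 1.0) < 1e-9
    return sum(w * A for w, A in zip(weights, layers))

def two_way_spectral(A):
    """Cluster by the sign of the Fiedler vector of the graph Laplacian."""
    L = np.diag(A.sum(axis=1)) - A
    vals, vecs = np.linalg.eigh(L)
    return (vecs[:, 1] >= 0).astype(int)   # second-smallest eigenvector

# Two layers over 6 nodes with the same planted 2-block structure,
# one of them perturbed by uniform noise edges.
block = np.zeros((6, 6))
block[:3, :3] = block[3:, 3:] = 1.0
np.fill_diagonal(block, 0.0)
noise = np.full((6, 6), 0.1)
np.fill_diagonal(noise, 0.0)
A = aggregate([block + noise, block], [0.5, 0.5])
labels = two_way_spectral(A)
print(labels)  # e.g. [0 0 0 1 1 1] (up to label swap)
```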

In this paper, we investigate mutual information as a cost function
for clustering, and show in which cases hard, i.e., deterministic,
clusters are optimal. Using convexity properties of mutual
information, we show that certain formulations of the information
bottleneck problem are solved by hard clusters. Similarly, hard
clusters are optimal for the information-theoretic co-clustering
problem that deals with simultaneous clustering of two dependent data
sets. If both data sets have to be clustered using the same cluster
assignment, hard clusters are not optimal in general. We point at
interesting and practically relevant special cases of this so-called
pairwise clustering problem, for which we can either prove or have
evidence that hard clusters are optimal. Our results thus show that
one can relax the otherwise combinatorial hard clustering problem to a
real-valued optimization problem with the same global optimum.
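
The hard-versus-soft intuition can be checked numerically: mutual information is convex in the conditional distribution p(c|x), so a soft assignment that mixes hard ones cannot beat the best hard one. A small sketch with a toy distribution of our own:

```python
import numpy as np

def mutual_information(p_x, p_c_given_x):
    """I(X;C) in bits for a marginal p(x) and a (possibly soft)
    clustering given as a row-stochastic matrix p(c|x)."""
    p_xc = p_x[:, None] * p_c_given_x              # joint p(x, c)
    p_c = p_xc.sum(axis=0)
    mask = p_xc > 0
    ratio = p_xc[mask] / (p_x[:, None] * p_c[None, :])[mask]
    return float((p_xc[mask] * np.log2(ratio)).sum())

p_x = np.full(4, 0.25)
hard = np.array([[1, 0], [0, 1], [1, 0], [0, 1]], float)  # deterministic
soft = np.full((4, 2), 0.5)                               # maximally fuzzy
print(mutual_information(p_x, hard), mutual_information(p_x, soft))  # 1.0 0.0
```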

Otherwise known as thinking through a situation in reverse or thinking
“backwards,” inversion is a problem-solving technique. Often by
considering what we want to avoid rather than what we want to get, we
come up with better solutions. Inversion works not just in mathematics
but in nearly every area of life. As the saying goes, “Just tell me
where I’m going to die so I can never go there.”

• [cs.DC]Optimizing the Performance of Reactive Molecular Dynamics
Simulations for Multi-Core Architectures

Hasan Metin Aktulga, Christopher Knight, Paul Coffman, Kurt A. O’Hearn,
Tzu-Ray Shan, Wei Jiang

http://arxiv.org/abs/1706.07772v1

• [cs.LG]Using Neural Network Formalism to Solve Multiple-Instance
Problems

Tomas Pevny, Petr Somol
http://arxiv.org/abs/1609.07257v1

• [cs.LG]Application of multiview techniques to NHANES dataset
Aileme Omogbai
http://arxiv.org/abs/1608.04783v1


2. Falsification / Confirmation Bias

Reactive molecular dynamics simulations are computationally demanding.
Reaching spatial and temporal scales where interesting scientific
phenomena can be observed requires efficient and scalable
implementations on modern hardware. In this paper, we focus on
optimizing the performance of the widely used LAMMPS/ReaxC package for
multi-core architectures. As hybrid parallelism allows better leverage
of multi-core hardware, we adopt it in the construction of bonded and
nonbonded lists, and in the computation of complex ReaxFF interactions.
large volumes of trajectory data produced and to save users the burden
of post-processing, we also develop a novel in-situ tool for molecular
species analysis. We analyze the performance of the resulting
ReaxC-OMP package on Mira, an IBM Blue Gene/Q supercomputer. For PETN
systems of sizes ranging from 32 thousand to 16.6 million particles,
we observe speedups in the range of 1.5-4.5x. We observe sustained
performance improvements for up to 262,144 cores (1,048,576 processes)
of Mira and a weak scaling efficiency of 91.5% in large simulations
containing 16.6 million particles. The in-situ molecular species
analysis tool incurs only insignificant overheads across various
system sizes and run configurations.
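
For reference, the speedup and weak-scaling figures reported above follow from the usual definitions; the timings below are hypothetical and only illustrate the formulas:

```python
def weak_scaling_efficiency(t_base, t_scaled):
    """Weak scaling: problem size grows with core count, so the ideal
    runtime is flat; efficiency is baseline time over scaled-run time."""
    return t_base / t_scaled

def speedup(t_serial, t_parallel):
    """Ratio of reference runtime to optimized/parallel runtime."""
    return t_serial / t_parallel

# Hypothetical timings in seconds, for illustration only:
print(weak_scaling_efficiency(120.0, 131.1))  # ≈ 0.915, i.e. 91.5%
print(speedup(90.0, 20.0))                    # 4.5x
```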

Many objects in the real world are difficult to describe by a single
numerical vector of a fixed length, whereas describing them by a set
of vectors is more natural. Therefore, Multiple instance learning
(MIL) techniques have been constantly gaining on importance throughout
last years. MIL formalism represents each object (sample) by a set
(bag) of feature vectors (instances) of fixed length where knowledge
about objects (e.g., class label) is available on bag level but not
necessarily on instance level. Many standard tools have been adapted
to the MIL setting since the problem was formalized in the late
nineties. In this work we propose a
neural network (NN) based formalism that intuitively bridges the gap
between MIL problem definition and the vast existing knowledge-base of
standard models and classifiers. We show that the proposed NN
formalism is effectively optimizable by a modified back-propagation
algorithm and can reveal unknown patterns inside bags. Comparison to
eight types of classifiers from the prior art on a set of 14 publicly
available benchmark datasets confirms the advantages and accuracy of
the proposed solution.
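The bag-level idea behind such a formalism can be sketched in a few lines of NumPy. The architecture below (a shared instance embedding, max-pooling over instances, a bag-level linear score) is a minimal, hypothetical illustration of the MIL-NN bridge, not the authors' exact network:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions, chosen only for illustration.
D_IN, D_HID = 5, 8
W1 = rng.normal(size=(D_IN, D_HID))   # shared instance-embedding weights
w2 = rng.normal(size=D_HID)           # bag-level classifier weights

def bag_score(bag):
    """bag: (n_instances, D_IN) array -> scalar bag score.

    Each instance is embedded with a shared layer, then the bag is
    summarized by max-pooling over instances, so the score is invariant
    to instance order and to the number of instances in the bag --
    exactly the property MIL needs, since labels live at bag level.
    """
    h = np.tanh(bag @ W1)            # (n_instances, D_HID)
    bag_repr = h.max(axis=0)         # (D_HID,) pooled bag representation
    return float(bag_repr @ w2)

bag_a = rng.normal(size=(3, D_IN))   # a bag with 3 instances
bag_b = rng.normal(size=(7, D_IN))   # bags may differ in size
s_a, s_b = bag_score(bag_a), bag_score(bag_b)
# Permuting the instances inside a bag does not change its score.
assert np.isclose(bag_score(bag_a[::-1]), s_a)
```

Because the whole pipeline is differentiable, such a network can be trained end-to-end with (modified) back-propagation, which is the point the abstract makes.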

Disease prediction or classification using health datasets involves
using well-known predictors associated with the disease as features
for the models. This study considers multiple data components of an
individual’s health, using the relationship between variables to
generate features that may improve the performance of disease
classification models. In order to capture information from different
aspects of the data, this project uses a multiview learning approach,
using Canonical Correlation Analysis (CCA), a technique that finds
projections with maximum correlations between two data views. Data
categories collected from the NHANES survey (1999-2014) are used as
views to learn the multiview representations. The usefulness of the
representations is demonstrated by applying them as features in a
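The CCA step can be sketched with NumPy: whiten each view's covariance, take the SVD of the whitened cross-covariance, and read off the top canonical correlation; the projections then serve as learned features. This is the textbook construction on synthetic toy data, not the study's NHANES pipeline:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for two "views" of the same individuals
# (the study uses NHANES data categories; this is toy data).
n = 500
z = rng.normal(size=(n, 1))                    # shared latent signal
X = np.hstack([z + 0.1 * rng.normal(size=(n, 1)) for _ in range(3)])
Y = np.hstack([z + 0.1 * rng.normal(size=(n, 1)) for _ in range(4)])

def first_canonical_corr(X, Y, reg=1e-8):
    """Top canonical correlation and the two projected feature vectors."""
    Xc, Yc = X - X.mean(0), Y - Y.mean(0)
    Cxx = Xc.T @ Xc / n + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / n + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / n

    def inv_sqrt(C):                 # symmetric inverse square root
        w, V = np.linalg.eigh(C)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    # SVD of the whitened cross-covariance gives the canonical pairs.
    K = inv_sqrt(Cxx) @ Cxy @ inv_sqrt(Cyy)
    U, s, Vt = np.linalg.svd(K)
    a, b = inv_sqrt(Cxx) @ U[:, 0], inv_sqrt(Cyy) @ Vt[0]
    return s[0], Xc @ a, Yc @ b      # projections usable as features

rho, fx, fy = first_canonical_corr(X, Y)
assert rho > 0.9   # the two views share a strong common signal
```

The projections `fx` and `fy` are maximally correlated one-dimensional summaries of each view, which is what gets fed downstream as features.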

What a man wishes, he also believes. Similarly, what we believe is what
we choose to see. This is commonly referred to as the confirmation bias.
It is a deeply ingrained mental habit, both energy-conserving and
comfortable, to look for confirmations of long-held wisdom rather than
violations. Yet the scientific process – including hypothesis
generation, blind testing when needed, and objective statistical rigor –
is designed to root out precisely the opposite, which is why it works so
well when followed.

• [cs.DS]Testing Piecewise Functions
Steve Hanneke, Liu Yang
http://arxiv.org/abs/1706.07669v1

• [cs.MM]Deep Quality: A Deep No-reference Quality Assessment
System

Prajna Paramita Dash, Akshaya Mishra, Alexander Wong
http://arxiv.org/abs/1609.07170v1

• [cs.LG]Dynamic Collaborative Filtering with Compound Poisson
Factorization

Ghassen Jerfel, Mehmet E. Basbug, Barbara E. Engelhardt
http://arxiv.org/abs/1608.04839v1

In his final year, Farhan gave up his degree at Imperial College to become a wildlife photographer.

The modern scientific enterprise operates under the principle of
falsification: A method is termed scientific if it can be stated in such
a way that a certain defined result would cause it to be proved false.
Pseudo-knowledge and pseudo-science operate and propagate by being
unfalsifiable – as with astrology, we are unable to prove them either
correct or incorrect because the conditions under which they would be
shown false are never stated.

This work explores the query complexity of property testing for
general piecewise functions on the real line, in the active and
passive property testing settings. The results are proven under an
abstract zero-measure crossings condition, which has as special cases
piecewise constant functions and piecewise polynomial functions. We
find that, in the active testing setting, the query complexity of
testing general piecewise functions is independent of the number of
pieces. We also identify the optimal dependence on the number of
pieces in the query complexity of passive testing in the special case
of piecewise constant functions.

Image quality assessment (IQA) continues to garner great interest in
the research community, particularly given the tremendous rise in
consumer video capture and streaming. Despite significant research
effort in IQA in the past few decades, the area of no-reference image
quality assessment remains a great challenge and is largely unsolved.
In this paper, we propose a novel no-reference image quality
assessment system called Deep Quality, which leverages the power of
deep learning to model the complex relationship between visual content
and the perceived quality. Deep Quality consists of a novel
multi-scale deep convolutional neural network, trained to learn to
assess image quality based on training samples consisting of different
distortions and degradations such as blur, Gaussian noise, and
compression artifacts. Preliminary results using the CSIQ benchmark
image quality dataset showed that Deep Quality was able to achieve
strong quality prediction performance (89% patch-level and 98%
image-level prediction accuracy), achieving performance similar to
that of full-reference IQA methods.

Model-based collaborative filtering analyzes user-item interactions to
infer latent factors that represent user preferences and item
characteristics in order to predict future interactions. Most
collaborative filtering algorithms assume that these latent factors
are static, although it has been shown that user preferences and item
perceptions drift over time. In this paper, we propose a conjugate and
numerically stable dynamic matrix factorization (DCPF) based on
compound Poisson matrix factorization that models the smoothly
drifting latent factors using Gamma-Markov chains. We propose a
numerically stable Gamma chain construction, and then present a
stochastic variational inference approach to estimate the parameters
of our model. We apply our model to time-stamped ratings data sets:
Netflix, Yelp, and Last.fm, where DCPF achieves a higher predictive
accuracy than state-of-the-art static and dynamic factorization
models.
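The "smoothly drifting latent factors" can be illustrated with a Gamma-Markov chain in which each step is conditionally Gamma with mean equal to the previous value, so the chain is positive and drifts slowly. This is a common construction used for illustration here; the paper's numerically stable chain may be parameterized differently:

```python
import numpy as np

rng = np.random.default_rng(2)

def gamma_markov_chain(theta0, a, n_steps, rng):
    """Simulate theta_t ~ Gamma(shape=a, rate=a / theta_{t-1}).

    This keeps E[theta_t | theta_{t-1}] = theta_{t-1}, so the chain
    drifts smoothly while staying positive -- the qualitative property
    DCPF exploits for time-evolving latent factors. A larger shape `a`
    means smaller steps and a smoother trajectory.
    """
    theta = np.empty(n_steps)
    theta[0] = theta0
    for t in range(1, n_steps):
        # NumPy parameterizes Gamma by shape and scale = 1 / rate.
        theta[t] = rng.gamma(shape=a, scale=theta[t - 1] / a)
    return theta

chain = gamma_markov_chain(theta0=1.0, a=200.0, n_steps=100, rng=rng)
assert (chain > 0).all()                      # factors stay positive
assert np.abs(np.diff(chain)).max() < 0.5     # large `a` => smooth drift
```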

“挣得少一点，房子小一点，车子小一点，但我会很快乐，会真正幸福”

1. Circle of Competence

• [cs.IR]Causal Embeddings for Recommendation
Stephen Bonner, Flavian Vasile
http://arxiv.org/abs/1706.07639v1

• [cs.NE]Deep Learning in Multi-Layer Architectures of Dense
Nuclei

Yonghua Yin, Erol Gelenbe
http://arxiv.org/abs/1609.07160v1

• [cs.LG]Mollifying Networks
Caglar Gulcehre, Marcin Moczulski, Francesco Visin, Yoshua Bengio
http://arxiv.org/abs/1608.04980v1

An idea introduced by Warren Buffett and Charles Munger in relation to
investing: each individual tends to have an area or areas in which they
really, truly know their stuff, their area of special competence. Areas
not inside that circle are problematic because not only are we ignorant
about them, but we may also be ignorant of our own ignorance. Thus, when
we’re making decisions, it becomes important to define and attend to our
special circle, so as to act accordingly.

Recommendations are treatments. While today's recommender systems
attempt to emulate the naturally occurring user behaviour by
predicting either missing entries in the user-item matrix or computing
the most likely continuation of user sessions, we need to start
thinking of recommendations in terms of optimal interventions with
respect to specific goals, such as the increase of number of user
conversions on an e-commerce website. This objective is known as
Incremental Treatment Effect prediction (ITE) in the causal community.
We propose a new way of factorizing user-item matrices created from a
large sample of biased data collected using a control recommendation
policy and from limited randomized recommendation data collected using
a treatment recommendation policy in order to jointly optimize the
prediction of outcomes of the treatment policy and its incremental
treatment effect with respect to the control policy. We compare our
method against both state-of-the-art factorization methods and against
new approaches of causal recommendation and show significant
improvements in performance.
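At its core, the incremental treatment effect is the difference in expected outcomes between the treatment and control recommendation policies. A toy sketch with hypothetical per-user conversion outcomes (not the paper's joint factorization estimator):

```python
import numpy as np

# Hypothetical conversion outcomes (1 = converted) for the same users
# under two recommendation policies. In practice only one outcome per
# user is observed, which is why the paper combines large biased
# control data with limited randomized treatment data.
outcome_control   = np.array([0, 0, 1, 0, 1, 0, 0, 1])
outcome_treatment = np.array([1, 0, 1, 1, 1, 0, 1, 1])

# Average incremental treatment effect of switching policies.
ite = outcome_treatment.mean() - outcome_control.mean()
assert np.isclose(ite, 0.375)  # 6/8 - 3/8
```

The paper's contribution is estimating this quantity jointly while factorizing the user-item matrix, rather than from fully observed outcomes as in this sketch.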

In dense clusters of neurons in nuclei, cells may interconnect via
soma-to-soma interactions, in addition to conventional synaptic
connections. We illustrate this idea with a multi-layer architecture
(MLA) composed of multiple clusters of recurrent sub-networks of
spiking Random Neural Networks (RNN) with dense soma-to-soma
interactions. We use this RNN-MLA architecture for deep learning. The
inputs to the clusters are normalised by adjusting the external
arrival rates of spikes to each cluster, and we then apply this
architecture to learning from multi-channel datasets. We present
numerical results based on both images and sensor-based data that show
the value of this RNN-MLA for deep learning.

The optimization of deep neural networks can be more challenging than
traditional convex optimization problems due to the highly non-convex
nature of the loss function, e.g. it can involve pathological
landscapes such as saddle-surfaces that can be difficult to escape for
algorithms based on simple gradient descent. In this paper, we attack
the problem of optimization of highly non-convex neural networks by
starting with a smoothed, or *mollified*, objective function whose
energy landscape gradually becomes more non-convex during
the training. Our proposition is inspired by the recent studies in
continuation methods: similar to curriculum methods, we begin learning
an easier (possibly convex) objective function and let it evolve
during the training, until it eventually goes back to being the
original, difficult to optimize, objective function. The complexity of
the mollified networks is controlled by a single hyperparameter which
is annealed during the training. We show improvements on various
difficult optimization tasks and establish a relationship with recent
works on continuation methods for neural networks and mollifiers.
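The mollification idea can be illustrated numerically in one dimension: convolving a wiggly objective with a Gaussian kernel removes most of its local minima, and annealing the kernel width toward zero recovers the original function. This is an illustration of the principle only, not the paper's network-level construction:

```python
import numpy as np

def gaussian_kernel(sigma, dx, half_width=4.0):
    """Discrete, normalized Gaussian kernel on a grid with spacing dx."""
    r = np.arange(-half_width * sigma, half_width * sigma + dx, dx)
    k = np.exp(-0.5 * (r / sigma) ** 2)
    return k / k.sum()

def count_local_minima(y):
    """Number of strict interior local minima of a sampled function."""
    return int(np.sum((y[1:-1] < y[:-2]) & (y[1:-1] < y[2:])))

dx = 0.01
x = np.arange(-3, 3, dx)
f = np.sin(8 * x) + 0.3 * x ** 2        # wiggly, non-convex objective

# Mollified objective: f convolved with a Gaussian of width sigma.
# Annealing sigma -> 0 during training would recover f itself.
f_smooth = np.convolve(f, gaussian_kernel(sigma=0.5, dx=dx), mode="same")

# Smoothing eliminates most of the local minima of the original.
assert count_local_minima(f_smooth) < count_local_minima(f)
```

In the paper the single annealed hyperparameter plays the role of `sigma` here, controlling how far the training objective is from the original loss.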

1. The Principle of Parsimony (Occam’s Razor)

• [cs.IR]Comparing Neural and Attractiveness-based Visual Features
for Artwork Recommendation

Vicente Dominguez, Pablo Messina, Denis Parra, Domingo Mery, Christoph
Trattner, Alvaro Soto

http://arxiv.org/abs/1706.07515v1

• [cs.NE]Multi-Output Artificial Neural Network for Storm Surge
Prediction in North Carolina

Anton Bezuglov, Brian Blanton, Reinaldo Santiago
http://arxiv.org/abs/1609.07378v1

• [cs.LG]Reinforcement Learning algorithms for regret minimization
in structured Markov Decision Processes

K J Prabuchandran, Tejas Bodas, Theja Tulabandhula
http://arxiv.org/abs/1608.04929v1

Link for the normal distribution:

Named after the friar William of Ockham, Occam’s Razor is a heuristic by
which we select among competing explanations. Ockham stated that we
should prefer the simplest explanation with the least moving parts: it
is easier to falsify (see: Falsification), easier to understand, and
more likely, on average, to be correct. This principle is not an iron
law but a tendency and a mindset: If all else is equal, it’s more likely
that the simple solution suffices. Of course, we also keep in mind
Einstein’s famous idea (even if apocryphal) that “an idea should be made
as simple as possible, but no simpler.”

Advances in image processing and computer vision in recent years
have brought about the use of visual features in artwork
recommendation. Recent works have shown that visual features obtained
from pre-trained deep neural networks (DNNs) perform very well for
recommending digital art. Other recent works have shown that explicit
visual features (EVF) based on attractiveness can perform well in
preference prediction tasks, but no previous work has compared DNN
features versus specific attractiveness-based visual features (e.g.
brightness, texture) in terms of recommendation performance. In this
work, we study and compare the performance of DNN and EVF features for
the purpose of physical artwork recommendation using transaction data
from UGallery, an online store of physical paintings. In addition, we
perform an exploratory analysis to understand if DNN embedded features
have some relation with certain EVF. Our results show that DNN
features outperform EVF, that certain EVF features are more suited for
physical artwork recommendation and, finally, we show evidence that
certain neurons in the DNN might be partially encoding visual features
such as brightness, providing an opportunity for explaining
recommendations based on visual neural models.
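An explicit attractiveness-based feature such as brightness is cheap to compute directly from pixels. A minimal sketch using the standard Rec. 601 luminance weights (the paper's exact EVF definitions may differ):

```python
import numpy as np

def brightness(img):
    """Mean luminance of an RGB image with values in [0, 1],
    using the Rec. 601 luma weights (0.299 R + 0.587 G + 0.114 B).
    """
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    return float((0.299 * r + 0.587 * g + 0.114 * b).mean())

dark  = np.zeros((4, 4, 3))          # all-black image
light = np.ones((4, 4, 3))           # all-white image
assert brightness(dark) == 0.0
assert np.isclose(brightness(light), 1.0)
```

The contrast the paper draws is between scalar features like this one and the thousands of dimensions of a pre-trained DNN embedding, some of whose neurons may themselves partially encode brightness.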

During hurricane seasons, emergency managers and other decision makers
need accurate and "on-time" information on potential storm surge
impacts. Fully dynamical computer models, such as the ADCIRC tide,
storm surge, and wind-wave model, take several hours to complete a
forecast when configured at high spatial resolution. Additionally,
statistically meaningful ensembles of high-resolution models (needed
for uncertainty estimation) cannot easily be computed in near real-time.
This paper discusses an artificial neural network model for storm
surge prediction in North Carolina. The network model provides fast,
real-time storm surge estimates at coastal locations in North
Carolina. The paper studies the performance of the neural network
model vs. other models on synthetic and real hurricane data.

A recent goal in the Reinforcement Learning (RL) framework is to
choose a sequence of actions or a policy to maximize the reward
collected or minimize the regret incurred in a finite time horizon.
For several RL problems in operation research and optimal control, the
optimal policy of the underlying Markov Decision Process (MDP) is
characterized by a known structure. The current state of the art
algorithms do not utilize this known structure of the optimal policy
while minimizing regret. In this work, we develop new RL algorithms
that exploit the structure of the optimal policy to minimize regret.
Numerical experiments on MDPs with structured optimal policies show
that our algorithms have better performance, are easy to implement,
have a smaller run-time, and require fewer random-number
generations.

1. Hanlon’s Razor

• [cs.IR]Contextual Sequence Modeling for Recommendation with
Recurrent Neural Networks

Elena Smirnova, Flavian Vasile
http://arxiv.org/abs/1706.07684v1

• [cs.NI]Hydra: Leveraging Functional Slicing for Efficient
Distributed SDN Controllers

Yiyang Chang, Ashkan Rezaei, Balajee Vamanan, Jahangir Hasan, Sanjay
Rao, T. N. Vijaykumar

http://arxiv.org/abs/1609.07192v1

• [cs.MM]Towards Music Captioning: Generating Music Playlist
Descriptions

Keunwoo Choi, George Fazekas, Mark Sandler
http://arxiv.org/abs/1608.04868v1

"You're all caught up in a race. Even if you come first, what good is this way of doing things? Will your knowledge grow? No, only the pressure will. This is a university, not a pressure cooker."

Harder to trace in its origin, Hanlon’s Razor states that we should not
attribute to malice that which is more easily explained by stupidity. In
a complex world, this principle helps us avoid extreme paranoia and
ideology, often very hard to escape from, by not generally assuming that
bad results are the fault of a bad actor, although they can be. More
likely, a mistake has been made.

Recommendations can greatly benefit from good representations of the
user state at recommendation time. Recent approaches that leverage
Recurrent Neural Networks (RNNs) for session-based recommendations
have shown that Deep Learning models can provide useful user
representations for recommendation. However, current RNN modeling
approaches summarize the user state by only taking into account the
sequence of items that the user has interacted with in the past,
without taking into account other essential types of context
information such as the associated types of user-item interactions,
the time gaps between events and the time of day for each interaction.
To address this, we propose a new class of Contextual Recurrent Neural
Networks for Recommendation (CRNNs) that take contextual information
into account in both the input and output layers, modifying the
behavior of the RNN by combining the context embedding with the item
embedding and, more explicitly, by parametrizing the hidden-unit
transitions in the model dynamics as a function of context
information. We compare our CRNN approach with RNN and non-sequential
baselines and show good improvements on the next-event prediction task.
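The two injection points described above, context concatenated with the item embedding at the input and context modulating the hidden-state transition, can be sketched as a single RNN step. Shapes and the sigmoid gating below are hypothetical illustration choices, not the paper's exact parameterization:

```python
import numpy as np

rng = np.random.default_rng(3)

D_ITEM, D_CTX, D_HID = 6, 3, 8      # hypothetical embedding sizes
W_in  = rng.normal(size=(D_ITEM + D_CTX, D_HID), scale=0.1)
U     = rng.normal(size=(D_HID, D_HID), scale=0.1)
W_mod = rng.normal(size=(D_CTX, D_HID), scale=0.1)

def crnn_step(h_prev, e_item, e_ctx):
    """One contextual-RNN transition.

    Context enters twice: concatenated with the item embedding at the
    input, and as a multiplicative gate on the hidden-state transition
    (so the dynamics themselves depend on context such as interaction
    type or time gap).
    """
    x = np.concatenate([e_item, e_ctx])            # context in the input
    gate = 1.0 / (1.0 + np.exp(-(e_ctx @ W_mod)))  # context-dependent gate
    return np.tanh(x @ W_in + gate * (h_prev @ U))

h = np.zeros(D_HID)
for _ in range(5):   # unroll a short user session
    h = crnn_step(h, rng.normal(size=D_ITEM), rng.normal(size=D_CTX))
assert h.shape == (D_HID,) and np.all(np.abs(h) <= 1.0)
```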

The conventional approach to scaling Software Defined Networking (SDN)
controllers today is to partition switches based on network topology,
with each partition being controlled by a single physical controller,
running all SDN applications. However, topological partitioning is
limited by the fact that (i) performance of latency-sensitive (e.g.,
monitoring) SDN applications associated with a given partition may be
impacted by co-located compute-intensive (e.g., route computation)
applications; (ii) simultaneously achieving low convergence time and
response times might be challenging; and (iii) communication between
instances of an application across partitions may increase latencies.
To tackle these issues, in this paper, we explore functional slicing,
a complementary approach to scaling, where multiple SDN applications
belonging to the same topological partition may be placed in
physically distinct servers. We present Hydra, a framework for
distributed SDN controllers based on functional slicing. Hydra chooses
partitions based on convergence time as the primary metric, but places
application instances across partitions in a manner that keeps
response times low while considering communication between
applications of a partition, and instances of an application across
partitions. Evaluations using the Floodlight controller show the
importance and effectiveness of Hydra in simultaneously keeping
convergence times on failures small, while sustaining higher
throughput per partition and ensuring responsiveness to
latency-sensitive applications.

Descriptions are often provided along with recommendations to help
users’ discovery. Recommending automatically generated music playlists
(e.g. personalised playlists) introduces the problem of generating
descriptions. In this paper, we propose a method for generating music
playlist descriptions, which we call music captioning. In the
proposed method, audio content analysis and natural language
processing are adopted to utilise the information of each track.

1. Second-Order Thinking

• [cs.IR]Specializing Joint Representations for the task of Product
Recommendation

Thomas Nedelec, Elena Smirnova, Flavian Vasile
http://arxiv.org/abs/1706.07625v1

• [cs.SD]Discovering Sound Concepts and Acoustic Relations In
Text

Anurag Kumar, Bhiksha Raj, Ndapandula Nakashole
http://arxiv.org/abs/1609.07384v1

• [cs.NE]Power Series Classification: A Hybrid of LSTM and a Novel

Yuanlong Li, Han Hu, Yonggang Wen, Jun Zhang
http://arxiv.org/abs/1608.04171v2

In all human systems and most complex systems, the second layer of
effects often dwarfs the first layer, yet often goes unconsidered. In
other words, we must consider that effects have effects. Second-order
thinking is best illustrated by the idea of standing on your tiptoes at
a parade: Once one person does it, everyone will do it in order to see,
thus negating the first tiptoer. Now, however, the whole parade audience
suffers on their toes rather than standing firmly on their whole feet.

We propose a unified product embedded representation that is optimized
for the task of retrieval-based product recommendation. To this end,
we introduce a new way to fuse modality-specific product embeddings
into a joint product embedding, in order to leverage both product
content information, such as textual descriptions and images, and
product collaborative filtering signal. By introducing the fusion step
at the very end of our architecture, we are able to train each
modality separately, allowing us to keep a modular architecture that
is preferable in real-world recommendation deployments. We analyze our
performance on normal and hard recommendation setups such as
cold-start and cross-category recommendations and achieve good
performance on a large product shopping dataset.

In this paper we describe approaches for discovering acoustic concepts
and relations in text. The first major goal is to be able to identify
text phrases which contain a notion of audibility and can be termed as
a sound or an acoustic concept. We also propose a method to define an
acoustic scene through a set of sound concepts. We use pattern
matching and part-of-speech tags to generate sound concepts from
large-scale text corpora. We use dependency parsing and an LSTM
recurrent neural network to predict a set of sound concepts for a given acoustic
scene. These methods are not only helpful in creating an acoustic
knowledge base but also directly help in acoustic event and scene
detection research in a variety of ways.
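The pattern-matching step can be illustrated with a toy extractor over part-of-speech-tagged text: spans matching "(adjective)* noun + {sound, noise}" are taken as sound-concept candidates. The tags here are hand-assigned and the pattern is a simplified stand-in for the paper's patterns over large corpora:

```python
# Toy sound-concept extractor over (word, POS-tag) pairs.
# A real pipeline would run a POS tagger over large text corpora.
AUDIBLE_HEADS = {"sound", "noise"}

def extract_sound_concepts(tagged):
    """Collect 'JJ* NN (sound|noise)' spans, e.g. 'loud engine noise'."""
    concepts, i = [], 0
    while i < len(tagged):
        j = i
        while j < len(tagged) and tagged[j][1] == "JJ":   # adjectives
            j += 1
        if (j + 1 < len(tagged) and tagged[j][1] == "NN"
                and tagged[j + 1][0] in AUDIBLE_HEADS):
            concepts.append(" ".join(w for w, _ in tagged[i:j + 2]))
            i = j + 2
        else:
            i += 1
    return concepts

sent = [("the", "DT"), ("loud", "JJ"), ("engine", "NN"), ("noise", "NN"),
        ("woke", "VBD"), ("the", "DT"), ("dog", "NN")]
assert extract_sound_concepts(sent) == ["loud engine noise"]
```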

As many applications organize data into temporal sequences, the
problem of time series data classification has been widely studied.
Recent studies show that the 1-nearest neighbor with dynamic time
warping (1NN-DTW) and the long short term memory (LSTM) neural network
can achieve a better performance than other machine learning
algorithms. In this paper, we build a novel time series classification
algorithm hybridizing 1NN-DTW and LSTM, and apply it to a practical
data center power monitoring problem. Firstly, we define a new
distance measure for the 1NN-DTW classifier, termed Advancing
Dynamic Time Warping (ADTW), which is non-commutative and does not
rely on dynamic programming. Secondly, we hybridize 1NN-ADTW and LSTM. In
particular, a series of auxiliary test samples generated by the linear
combination of the original test sample and its nearest neighbor with
ADTW are utilized to detect which classifier to trust in the hybrid
algorithm. Finally, using the power consumption data from a real data
center, we show that the proposed ADTW can improve the classification
accuracy from about 84% to 89%. Furthermore, with the hybrid
algorithm, the accuracy can be further improved and we achieve an
accuracy up to about 92%. Our research can inspire more studies on
non-commutative distance measures and on hybrids of deep learning
models with traditional models.
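The 1NN-DTW half of the hybrid is easy to sketch. Below is the classical dynamic-programming DTW distance with a 1-nearest-neighbor classifier; note that the paper's ADTW variant is non-commutative and avoids dynamic programming, so this is the standard baseline it improves on, not ADTW itself:

```python
import numpy as np

def dtw(a, b):
    """Classical dynamic-programming DTW distance between 1-D series."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def knn1_dtw(query, train_series, train_labels):
    """Label of the DTW-nearest training series."""
    dists = [dtw(query, s) for s in train_series]
    return train_labels[int(np.argmin(dists))]

# Toy 'power traces': flat vs. spiky, with different lengths --
# DTW handles the length mismatch by warping.
train = [np.array([0., 0., 0., 0.]), np.array([0., 3., 0., 3., 0.])]
labels = ["idle", "busy"]
assert dtw(train[0], train[0]) == 0.0
assert knn1_dtw(np.array([0., 0., 0.1, 0., 0.]), train, labels) == "idle"
assert knn1_dtw(np.array([0., 2.5, 0., 3.2]), train, labels) == "busy"
```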

1. The Map Is Not the Territory

• [cs.IT]A Combinatorial Methodology for Optimizing Non-Binary
Graph-Based Codes: Theoretical Analysis and Applications in Data
Storage

Ahmed Hareedy, Chinmayi Lanka, Nian Guo, Lara Dolecek
http://arxiv.org/abs/1706.07529v1

• [cs.SD]Novel stochastic properties of the short-time spectrum for
unvoiced pronunciation modeling and synthesis

Xiaodong Zhuang, Nikos E. Mastorakis
http://arxiv.org/abs/1609.07245v1

• [cs.SE]A Proposal for the Measurement and Documentation of
Research Software Sustainability in Interactive Metadata
Repositories

Stephan Druskat
http://arxiv.org/abs/1608.04529v2

The map of reality is not reality itself. If any map were to represent
its actual territory with perfect fidelity, it would be the size of the
territory itself. Thus, no need for a map! This model tells us that
there will always be an imperfect relationship between reality and the
models we use to represent and understand it. This imperfection is a
necessity in order to simplify. It is all we can do to accept this and
act accordingly.

Non-binary (NB) low-density parity-check (LDPC) codes are graph-based
codes that are increasingly being considered as a powerful error
correction tool for modern dense storage devices. The increasing
levels of asymmetry incorporated by the channels underlying modern
dense storage systems exacerbate the error floor problem. In recent
research, the weight consistency matrix (WCM) framework was introduced
as an effective NB-LDPC code optimization methodology that is suitable
for modern Flash memory and magnetic recording (MR) systems. In this
paper, we provide the in-depth theoretical analysis needed to
understand and properly apply the WCM framework. We focus on general
absorbing sets of type two (GASTs). In particular, we introduce a
novel tree representation of a GAST called the unlabeled GAST tree,
using which we prove that the WCM framework is optimal. Then, we
enumerate the WCMs. We demonstrate the significance of the savings
achieved by the WCM framework in the number of matrices processed to
remove a GAST. Moreover, we provide a linear-algebraic analysis of the
null spaces of WCMs associated with a GAST. We derive the minimum
number of edge weight changes needed to remove a GAST via its WCMs,
along with how to choose these changes. Additionally, we propose a new
set of problematic objects, namely the oscillating sets of type two
(OSTs), which contribute to the error floor of NB-LDPC codes with even
column weights on asymmetric channels, and we show how to customize
the WCM framework to remove OSTs. We also extend the domain of the WCM
framework applications by demonstrating its benefits in optimizing
column weight 5 codes, codes used over Flash channels with soft
information, and spatially-coupled codes. The performance gains
achieved via the WCM framework range between 1 and nearly 2.5 orders
of magnitude in the error floor region over interesting channels.

The stochastic properties of speech signals are a fundamental research
topic in speech analysis and processing. In this paper, multiple levels of
randomness in speech signal are discussed, and the stochastic
properties of unvoiced pronunciation are studied in detail, which has
not received sufficient research attention before. The study is based
on the signals of sustained unvoiced pronunciation captured in the
experiments, for which the amplitude and phase values in the
short-time spectrum are studied as random variables. The statistics of
amplitude for each frequency component are studied individually, based
on which a new property of “consistent standard deviation coefficient”
is revealed for the amplitude spectrum of unvoiced pronunciation. The
relationship between the amplitude probability distributions of
different frequency components is further studied, which indicates
that all the frequency components have a common prototype of amplitude
probability distribution. As an adaptive and flexible probability
distribution, the Weibull distribution is adopted to fit the
expectation-normalized amplitude spectrum data. The phase distribution
for the short-time spectrum is also studied, and the results show a
uniform distribution. A synthesis method for unvoiced pronunciation is
proposed based on the Weibull distribution of amplitude and uniform
distribution of phase, which is implemented by STFT with artificially
generated short-time spectrum with random amplitude and phase. The
synthesis results have auditory quality identical to that of the
original pronunciation, and autocorrelation similar to that of
the original signal, which proves the effectiveness of the proposed
stochastic model of short-time spectrum for unvoiced pronunciation.
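The synthesis recipe in the last sentences -- Weibull-distributed amplitudes, uniformly distributed phases, and inverse STFT with overlap-add -- can be sketched directly. Frame sizes and the Weibull shape parameter below are illustrative choices; the paper fits these quantities to measured pronunciations:

```python
import numpy as np

rng = np.random.default_rng(4)

def synth_unvoiced(n_frames=50, n_fft=256, hop=128, shape=2.0, rng=rng):
    """Synthesize an unvoiced-like signal from a random short-time
    spectrum: Weibull amplitudes, uniform phases, inverse-FFT frames
    combined by overlap-add with a Hann window.
    (Frame sizes and the Weibull shape are illustrative, not fitted.)
    """
    n_bins = n_fft // 2 + 1
    amp = rng.weibull(shape, size=(n_frames, n_bins))
    phase = rng.uniform(0, 2 * np.pi, size=(n_frames, n_bins))
    spec = amp * np.exp(1j * phase)           # random short-time spectrum

    window = np.hanning(n_fft)
    out = np.zeros(hop * (n_frames - 1) + n_fft)
    for t in range(n_frames):
        frame = np.fft.irfft(spec[t], n=n_fft)
        out[t * hop: t * hop + n_fft] += window * frame  # overlap-add
    return out

x = synth_unvoiced()
assert np.isrealobj(x) and np.isfinite(x).all()
assert len(x) == 128 * 49 + 256
```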

This paper proposes an interactive repository type for research
software metadata which measures and documents software sustainability
by accumulating metadata, and computing sustainability metrics over
them. Such a repository would help to overcome technical barriers to
software sustainability by furthering the discovery and identification
of sustainable software, thereby also facilitating documentation of
research software within the framework of software management plans.


1. Thought Experiments

• [cs.IT]Common-Message Broadcast Channels with Feedback in the
Nonasymptotic Regime: Full Feedback

Kasper Fløe Trillingsgaard, Wei Yang, Giuseppe Durisi, Petar
Popovski

http://arxiv.org/abs/1706.07731v1

• [math.OC]Screening Rules for Convex Problems
Anant Raj, Jakob Olbrich, Bernd Gärtner, Bernhard Schölkopf, Martin
Jaggi

http://arxiv.org/abs/1609.07478v1

• [cs.SI]Feature Driven and Point Process Approaches for Popularity
Prediction

Swapnil Mishra, Marian-Andrei Rizoiu, Lexing Xie
http://arxiv.org/abs/1608.04862v1