Compete, or die.

Mental Models: The Best Way to Make Intelligent Decisions (113 Models

Explained)

cs.AI – Artificial Intelligence

cs.CL – Computation and Language

cs.CR – Cryptography and Security

cs.CV – Computer Vision and Pattern Recognition

cs.CY – Computers and Society

cs.DC – Distributed, Parallel, and Cluster Computing

cs.DS – Data Structures and Algorithms

cs.IR – Information Retrieval

cs.IT – Information Theory

cs.LG – Machine Learning

cs.SD – Sound

cs.SI – Social and Information Networks

gr-qc – General Relativity and Quantum Cosmology

math.NT – Number Theory

math.PR – Probability

math.ST – Statistics Theory

q-bio.QM – Quantitative Methods

stat.AP – Applications

stat.ME – Methodology

stat.ML – Machine Learning (Statistics)

cs.AI – Artificial Intelligence

cs.CL – Computation and Language

cs.CR – Cryptography and Security

cs.CV – Computer Vision and Pattern Recognition

cs.CY – Computers and Society

cs.DC – Distributed, Parallel, and Cluster Computing

cs.DL – Digital Libraries

cs.DS – Data Structures and Algorithms

cs.LG – Machine Learning

cs.MM – Multimedia

cs.NE – Neural and Evolutionary Computing

cs.NI – Networking and Internet Architecture

cs.SD – Sound

math.OC – Optimization and Control

math.ST – Statistics Theory

stat.AP – Applications

stat.ME – Methodology

stat.ML – Machine Learning (Statistics)

cs.AI – Artificial Intelligence

cs.CL – Computation and Language

cs.CV – Computer Vision and Pattern Recognition

cs.CY – Computers and Society

cs.DC – Distributed, Parallel, and Cluster Computing

cs.DL – Digital Libraries

cs.DS – Data Structures and Algorithms

cs.IT – Information Theory

cs.LG – Machine Learning

cs.MM – Multimedia

cs.NE – Neural and Evolutionary Computing

cs.SE – Software Engineering

cs.SI – Social and Information Networks

cs.SY – Systems and Control

math.ST – Statistics Theory

q-bio.NC – Neurons and Cognition

stat.AP – Applications

stat.ME – Methodology

stat.ML – Machine Learning (Statistics)

stat.OT – Other Statistics

I don't know when it started, but college students have become far more ambitious.

How do you think the most rational people in the world operate their

minds? How do they make better decisions?

• [cs.AI]ECO-AMLP: A Decision Support System using an Enhanced Class

Outlier with Automatic Multilayer Perceptron for Diabetes Prediction

• [cs.AI]Model Selection with Nonlinear Embedding for Unsupervised

Domain Adaptation

• [cs.AI]Toward Goal-Driven Neural Network Models for the Rodent

Whisker-Trigeminal System

• [cs.CL]Comparison of Modified Kneser-Ney and Witten-Bell Smoothing

Techniques in Statistical Language Model of Bahasa Indonesia

• [cs.CL]End-to-end Conversation Modeling Track in DSTC6

• [cs.CL]Named Entity Recognition with stack residual LSTM and

trainable bias decoding

• [cs.CL]Neural Machine Translation with Gumbel-Greedy Decoding

• [cs.CL]Personalization in Goal-Oriented Dialog

• [cs.CR]Integrating self-efficacy into a gamified approach to thwart

phishing attacks

• [cs.CV]Computer-aided implant design for the restoration of cranial

defects

• [cs.CV]Joint Prediction of Depths, Normals and Surface Curvature

from RGB Images using CNNs

• [cs.CV]Listen to Your Face: Inferring Facial Action Units from Audio

Channel

• [cs.CV]Sampling Matters in Deep Embedding Learning

• [cs.CV]Training Adversarial Discriminators for Cross-channel

Abnormal Event Detection in Crowds

• [cs.CY]Computational Controversy

• [cs.CY]Human decisions in moral dilemmas are largely described by

Utilitarianism: virtual car driving study provides guidelines for ADVs

• [cs.CY]Mediated behavioural change in human-machine networks:

exploring network characteristics, trust and motivation

• [cs.DC]Heterogeneous MPSoCs for Mixed Criticality Systems:

Challenges and Opportunities

• [cs.DC]Interoperable Convergence of Storage, Networking and

Computation

• [cs.DC]Optimizing the Performance of Reactive Molecular Dynamics

Simulations for Multi-Core Architectures

• [cs.DS]Testing Piecewise Functions

• [cs.IR]Causal Embeddings for Recommendation

• [cs.IR]Comparing Neural and Attractiveness-based Visual Features for

Artwork Recommendation

• [cs.IR]Contextual Sequence Modeling for Recommendation with

Recurrent Neural Networks

• [cs.IR]Specializing Joint Representations for the task of Product

Recommendation

• [cs.IT]A Combinatorial Methodology for Optimizing Non-Binary

Graph-Based Codes: Theoretical Analysis and Applications in Data

Storage

• [cs.IT]Common-Message Broadcast Channels with Feedback in the

Nonasymptotic Regime: Full Feedback

• [cs.IT]Communication-Aware Computing for Edge Processing

• [cs.IT]Fundamental Limits of Universal Variable-to-Fixed Length

Coding of Parametric Sources

• [cs.IT]Fundamental Limits on Delivery Time in Cloud- and Cache-Aided

Heterogeneous Networks

• [cs.IT]High Performance Non-Binary Spatially-Coupled Codes for Flash

Memories

• [cs.IT]On Single-Antenna Rayleigh Block-Fading Channels at Finite

Blocklength

• [cs.IT]Retrodirective Multi-User Wireless Power Transfer with

Massive MIMO

• [cs.LG]Efficient Approximate Solutions to Mutual Information Based

Global Feature Selection

• [cs.LG]How Much Data is Enough? A Statistical Approach with Case

Study on Longitudinal Driving Behavior

• [cs.SD]Revisiting Autotagging Toward Faultless Instrumental

Playlists Generation

• [cs.SI]Information Diffusion in Social Networks in Two Phases

• [gr-qc]Deep Transfer Learning: A new deep learning glitch

classification method for advanced LIGO

• [math.NT]New cubic self-dual codes of length 54, 60 and 66

• [math.PR]Global algorithms for maximal eigenpair

• [math.ST]Asymmetric Matrix-Valued Covariances for Multivariate

Random Fields on Spheres

• [math.ST]Consistent Estimation in General Sublinear Preferential

Attachment Trees

• [math.ST]Multi-sequence segmentation via score and higher-criticism

tests

• [math.ST]Nonparametric Bayesian estimation of a Hölder continuous

diffusion coefficient

• [math.ST]Shape-constrained partial identification of a population

mean under unknown probabilities of sample selection

• [q-bio.QM]Cross-validation failure: small sample sizes lead to large

error bars

• [stat.AP]A Bayesian approach to modeling mortgage default and

prepayment

• [stat.AP]Causal Inference in Travel Demand Modeling (and the lack

thereof)

• [stat.AP]The Cost of Transportation : Spatial Analysis of US Fuel

Prices

• [stat.ME]Asymptotics of ABC

• [stat.ME]Bayesian Penalized Regression

• [stat.ME]Estimation and adaptive-to-model testing for regressions

with diverging number of predictors

• [stat.ME]Model choice in separate families: A comparison between the

FBST and the Cox test

• [stat.ME]Multivariate Geometric Skew-Normal Distribution

• [stat.ME]Pathwise Least Angle Regression and a Significance Test for

the Elastic Net

• [stat.ME]Point and Interval Estimation of Weibull Parameters Based

on Joint Progressively Censored Data

• [stat.ME]Spatially filtered unconditional quantile regression

• [stat.ML]A Variance Maximization Criterion for Active Learning

• [stat.ML]A-NICE-MC: Adversarial Training for MCMC

• [stat.ML]Query Complexity of Clustering with Side Information

• [cs.AI]Regulating Reward Training by Means of Certainty Prediction

in a Neural Network-Implemented Pong Game

• [cs.CL]AMR-to-text generation as a Traveling Salesman Problem

• [cs.CL]Deep Multi-Task Learning with Shared Memory

• [cs.CL]Incorporating Relation Paths in Neural Relation Extraction

• [cs.CL]Language as a Latent Variable: Discrete Generative Models for

Sentence Compression

• [cs.CR]Building accurate HAV exploiting User Profiling and Sentiment

Analysis

• [cs.CV]EFANNA : An Extremely Fast Approximate Nearest Neighbor

Search Algorithm Based on kNN Graph

• [cs.CV]EgoCap: Egocentric Marker-less Motion Capture with Two

Fisheye Cameras

• [cs.CV]Example-Based Image Synthesis via Randomized Patch-Matching

• [cs.CV]Funnel-Structured Cascade for Multi-View Face Detection with

Alignment-Awareness

• [cs.CV]Real-time Human Pose Estimation from Video with Convolutional

Neural Networks

• [cs.CV]The face-space duality hypothesis: a computational model

• [cs.CY]On the (im)possibility of fairness

• [cs.CY]Tracking the Trackers: Towards Understanding the Mobile

Advertising and Tracking Ecosystem

• [cs.DC]MPI Parallelization of the Resistive Wall Code STARWALL:

Report of the EUROfusion High Level Support Team Project JORSTAR

• [cs.DL]OCR++: A Robust Framework For Information Extraction from

Scholarly Articles

• [cs.DS]Scheduling Under Power and Energy Constraints

• [cs.LG]A Novel Progressive Multi-label Classifier for

Classincremental Data

• [cs.LG]Multilayer Spectral Graph Clustering via Convex Layer

Aggregation

• [cs.LG]Using Neural Network Formalism to Solve Multiple-Instance

Problems

• [cs.MM]Deep Quality: A Deep No-reference Quality Assessment System

• [cs.NE]Deep Learning in Multi-Layer Architectures of Dense Nuclei

• [cs.NE]Multi-Output Artificial Neural Network for Storm Surge

Prediction in North Carolina

• [cs.NI]Hydra: Leveraging Functional Slicing for Efficient

Distributed SDN Controllers

• [cs.SD]Discovering Sound Concepts and Acoustic Relations In Text

• [cs.SD]Novel stochastic properties of the short-time spectrum for

unvoiced pronunciation modeling and synthesis

• [math.OC]Screening Rules for Convex Problems

• [math.ST]A Wald-type test statistic for testing linear hypothesis in

logistic regression models based on minimum density power divergence

estimator

• [math.ST]On the Non-Existence of Unbiased Estimators in Constrained

Estimation Problems

• [math.ST]Robust Confidence Intervals in High-Dimensional

Left-Censored Regression

• [stat.AP]A Few Photons Among Many: Unmixing Signal and Noise for

Photon-Efficient Active Imaging

• [stat.AP]Predicting human-driving behavior to help driverless

vehicles drive: random intercept Bayesian Additive Regression Trees

• [stat.ME]Changepoint Detection in the Presence of Outliers

• [stat.ME]Efficient Feature Selection With Large and High-dimensional

Data

• [stat.ME]Fully Bayesian Estimation and Variable Selection in

Partially Linear Wavelet Models

• [stat.ME]Semiparametric clustered overdispersed multinomial

goodness-of-fit of log-linear models

• [stat.ME]Statistical Modeling for Spatio-Temporal Degradation Data

• [stat.ML]A penalized likelihood method for classification with

matrix-valued predictors

• [stat.ML]Constraint-Based Clustering Selection

• [stat.ML]Estimating Probability Distributions using “Dirac” Kernels

(via Rademacher-Walsh Polynomial Basis Functions)

• [stat.ML]One-vs-Each Approximation to Softmax for Scalable

Estimation of Probabilities

• [cs.AI]Open Problem: Approximate Planning of POMDPs in the class of

Memoryless Policies

• [cs.AI]Practical optimal experiment design with probabilistic

programs

• [cs.CL]An Efficient Character-Level Neural Machine Translation

• [cs.CL]Cohesion and Coalition Formation in the European Parliament:

Roll-Call Votes and Twitter Activities

• [cs.CL]Ensemble of Jointly Trained Deep Neural Network-Based

Acoustic Models for Reverberant Speech Recognition

• [cs.CL]Proceedings of the LexSem+Logics Workshop 2016

• [cs.CL]The Roles of Path-based and Distributional Information in

Recognizing Lexical Semantic Relations

• [cs.CV]An image compression and encryption scheme based on deep

learning

• [cs.CV]Frame- and Segment-Level Features and Candidate Pool

Evaluation for Video Caption Generation

• [cs.CV]Geometry-aware Similarity Learning on SPD Manifolds for

Visual Recognition

• [cs.CV]Globally Variance-Constrained Sparse Representation for Image

Set Compression

• [cs.CV]Large Angle based Skeleton Extraction for 3D Animation

• [cs.CY]Modelling Student Behavior using Granular Large Scale Action

Data from a MOOC

• [cs.DC]Safe Serializable Secure Scheduling: Transactions and the

Trade-off Between Security and Consistency

• [cs.DC]The BioDynaMo Project: Creating a Platform for Large-Scale

Reproducible Biological Simulations

• [cs.DL]Anomalies in the peer-review system: A case study of the

journal of High Energy Physics

• [cs.DS]Faster Sublinear Algorithms using Conditional Sampling

• [cs.DS]Lecture Notes on Spectral Graph Methods

• [cs.IT]Hard Clusters Maximize Mutual Information

• [cs.LG]Application of multiview techniques to NHANES dataset

• [cs.LG]Dynamic Collaborative Filtering with Compound Poisson

Factorization

• [cs.LG]Mollifying Networks

• [cs.LG]Reinforcement Learning algorithms for regret minimization in

structured Markov Decision Processes

• [cs.MM]Towards Music Captioning: Generating Music Playlist

Descriptions

• [cs.NE]Power Series Classification: A Hybrid of LSTM and a Novel

Advancing Dynamic Time Warping

• [cs.SE]A Proposal for the Measurement and Documentation of Research

Software Sustainability in Interactive Metadata Repositories

• [cs.SI]Feature Driven and Point Process Approaches for Popularity

Prediction

• [cs.SI]Learning Latent Local Conversation Modes for Predicting

Community Endorsement in Online Discussions

• [cs.SI]Observer Placement for Source Localization: The Effect of

Budgets and Transmission Variance

• [cs.SY]Graph Distances and Controllability of Networks

• [math.ST]Adaptive confidence sets for matrix completion

• [math.ST]Bayesian Posteriors For Arbitrarily Rare Events

• [q-bio.NC]A Three Spatial Dimension Wave Latent Force Model for

Describing Excitation Sources and Electric Potentials Produced by Deep

Brain Stimulation

• [stat.AP]The Use of Minimal Spanning Trees in Particle Physics

• [stat.AP]Tweedie distributions for fitting semicontinuous health

care utilization cost data

• [stat.ME]A Cautionary Tale: Mediation Analysis Applied to Censored

Survival Data

• [stat.ME]A Measure of Directional Outlyingness with Applications to

Image Data and Video

• [stat.ME]Exact balanced random imputation for sample survey data

• [stat.ME]Globally Homogenous Mixture Components and Local

Heterogeneity of Rank Data

• [stat.ML]A Convolutional Autoencoder for Multi-Subject fMRI Data

Aggregation

• [stat.ML]Clustering Mixed Datasets Using Homogeneity Analysis with

Applications to Big Data

• [stat.ML]Faster Principal Component Regression via Optimal

Polynomial Approximation to sgn(x)

• [stat.ML]Large-scale Learning With Global Non-Decomposable

Objectives

• [stat.ML]Outlier Detection on Mixed-Type Data: An Energy-based

Approach

• [stat.OT]Putting Down Roots: A Graphical Exploration of Community

Attachment

Chasing ever-higher GPAs, racing to sit the CPA exam in freshman and sophomore year.

They do it by mentally filing away a massive, but finite amount of

fundamental, unchanging knowledge that can be used in evaluating the

infinite number of unique scenarios which show up in the real world.


Then comes the TOEFL, then slaying the "G" exams, G's of every kind: the GRE and the GMAT.

That is how consistently rational and effective thinking is done, and if

we want to learn how to think properly ourselves, we need to figure out

how it’s done. Fortunately, there is a way, and it works.

• [cs.AI]**ECO-AMLP: A Decision Support System using an Enhanced Class
Outlier with Automatic Multilayer Perceptron for Diabetes Prediction**

*Maham Jahangir, Hammad Afzal, Mehreen Ahmed, Khawar Khurshid, Raheel Nawaz*

http://arxiv.org/abs/1706.07679v1

• [cs.AI]**Regulating Reward Training by Means of Certainty Prediction
in a Neural Network-Implemented Pong Game**

*Matt Oberdorfer, Matt Abuzalaf*

http://arxiv.org/abs/1609.07434v1

• [cs.AI]**Open Problem: Approximate Planning of POMDPs in the class
of Memoryless Policies**

*Kamyar Azizzadenesheli, Alessandro Lazaric, Animashree Anandkumar*

http://arxiv.org/abs/1608.04996v1

CET-4, CET-6? Please, don't even bring those up.

Before we dig deeper, let’s start by watching this short video on a

concept called mental models. Then continue on below.

With advanced data analytical techniques, efforts for more accurate

decision support systems for disease prediction are on rise. Surveys

by World Health Organization (WHO) indicate a great increase in number

of diabetic patients and related deaths each year. Early diagnosis of

diabetes is a major concern among researchers and practitioners. The

paper presents an application of Automatic Multilayer Perceptron, combined with an outlier detection method, Enhanced Class Outlier Detection using a distance-based algorithm, to create a prediction framework named Enhanced Class Outlier with Automatic Multilayer Perceptron (ECO-AMLP). A series of

experiments are performed on publicly available Pima Indian Diabetes

Dataset to compare ECO-AMLP with other individual classifiers as well

as ensemble based methods. The outlier technique used in our framework

gave better results as compared to other pre-processing and

classification techniques. Finally, the results are compared with other state-of-the-art methods reported in the literature for diabetes prediction on PIDD; the achieved accuracy of 88.7% bests all other reported studies.
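The pipeline above hinges on filtering class outliers before training the perceptron. A minimal sketch of a distance-based class-outlier filter is below; the threshold rule and the toy data are invented for illustration and are not the paper's exact algorithm:

```python
import math
from collections import defaultdict

def remove_class_outliers(X, y, k=2.0):
    """Drop samples whose distance to their own class centroid exceeds the
    class mean distance plus k standard deviations (toy class-outlier filter).
    The cleaned data would then be fed to the multilayer perceptron."""
    by_class = defaultdict(list)
    for x, label in zip(X, y):
        by_class[label].append(x)
    centroid = {c: [sum(col) / len(pts) for col in zip(*pts)]
                for c, pts in by_class.items()}
    stats = {}
    for c, pts in by_class.items():
        ds = [math.dist(x, centroid[c]) for x in pts]
        mu = sum(ds) / len(ds)
        sd = (sum((d - mu) ** 2 for d in ds) / len(ds)) ** 0.5
        stats[c] = (mu, sd)
    kept_X, kept_y = [], []
    for x, label in zip(X, y):
        mu, sd = stats[label]
        if math.dist(x, centroid[label]) <= mu + k * sd:
            kept_X.append(x)
            kept_y.append(label)
    return kept_X, kept_y

X = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (8.0, 8.0),  # last point: outlier
     (5.0, 5.0), (5.1, 4.9)]
y = [0, 0, 0, 0, 1, 1]
Xc, yc = remove_class_outliers(X, y, k=1.0)
print(len(Xc))  # the far-off class-0 point is dropped, 5 samples remain
```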

We present the first reinforcement-learning model to self-improve its

reward-modulated training implemented through a continuously improving

“intuition” neural network. An agent was trained to play the

arcade video game Pong with two reward-based alternatives, one where

the paddle was placed randomly during training, and a second where the

paddle was simultaneously trained on three additional neural networks

such that it could develop a sense of “certainty” as to how probable

its own predicted paddle position will be to return the ball. If the

agent was less than 95% certain to return the ball, the policy used an

intuition neural network to place the paddle. We trained both

architectures for an equivalent number of epochs and tested learning

performance by letting the trained programs play against a

near-perfect opponent. Through this, we found that the reinforcement

learning model that uses an intuition neural network for placing the

paddle during reward training quickly overtakes the simple

architecture in its ability to outplay the near-perfect opponent,

additionally outscoring that opponent by an increasingly wide margin

after additional epochs of training.
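The certainty-gated placement rule described above is simple enough to sketch directly; the function and the stand-in "intuition network" below are illustrative names, not the paper's implementation:

```python
def place_paddle(predicted_pos, certainty, intuition_net, threshold=0.95):
    """Trust the main prediction only when the certainty estimate says it is
    at least `threshold` likely to return the ball; otherwise defer to the
    intuition network (all names here are illustrative)."""
    if certainty >= threshold:
        return predicted_pos
    return intuition_net(predicted_pos)

# A stand-in "intuition network" that nudges the paddle toward center court.
intuition = lambda pos: 0.5 * (pos + 0.5)

print(place_paddle(0.2, 0.99, intuition))  # confident: keep the prediction
print(place_paddle(0.2, 0.80, intuition))  # uncertain: defer to intuition
```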

Planning plays an important role in the broad class of decision

theory. Planning has drawn much attention in recent work in the

robotics and sequential decision making areas. Recently, Reinforcement

Learning (RL), as an agent-environment interaction problem, has

brought further attention to planning methods. Generally in RL, one

can assume a generative model, e.g. graphical models, for the

environment, and then the task for the RL agent is to learn the model

parameters and find the optimal strategy based on these learnt

parameters. Based on environment behavior, the agent can assume

various types of generative models, e.g. Multi Armed Bandit for a

static environment, or Markov Decision Process (MDP) for a dynamic

environment. The advantage of these popular models is their

simplicity, which results in tractable methods of learning the

parameters and finding the optimal policy. The drawback of these

models is again their simplicity: these models usually underfit and

underestimate the actual environment behavior. For example, in

robotics, the agent usually has noisy observations of the environment

inner state and MDP is not a suitable model. More complex models like

Partially Observable Markov Decision Process (POMDP) can compensate

for this drawback. Fitting this model to the environment, where the

partial observation is given to the agent, generally gives dramatic

performance improvement, sometimes unbounded improvement, compared to

MDP. In general, finding the optimal policy for the POMDP model is

computationally intractable and fully non-convex, even for the class

of memoryless policies. The open problem is to come up with a method

to find an exact or an approximate optimal stochastic memoryless

policy for POMDP models.

Pulling every family connection, scrambling to line up internships.

It’s not that complicated, right?

• [cs.AI]**Model Selection with Nonlinear Embedding for Unsupervised
Domain Adaptation**

*Hemanth Venkateswara, Shayok Chakraborty, Troy McDaniel, Sethuraman Panchanathan*

http://arxiv.org/abs/1706.07527v1

• [cs.CL]**AMR-to-text generation as a Traveling Salesman Problem**

*Linfeng Song, Yue Zhang, Xiaochang Peng, Zhiguo Wang, Daniel Gildea*

http://arxiv.org/abs/1609.07451v1

• [cs.AI]**Practical optimal experiment design with probabilistic
programs**

*Long Ouyang, Michael Henry Tessler, Daniel Ly, Noah Goodman*

http://arxiv.org/abs/1608.05046v1

Long before graduation, the résumé already reads like a roll call of Fortune 500 names.

The idea for building a “latticework” of mental models comes from

Charlie Munger, Vice Chairman of Berkshire Hathaway and one of the

finest thinkers in the world.

Domain adaptation deals with adapting classifiers trained on data from

a source distribution, to work effectively on data from a target

distribution. In this paper, we introduce the Nonlinear Embedding

Transform (NET) for unsupervised domain adaptation. The NET reduces

cross-domain disparity through nonlinear domain alignment. It also

embeds the domain-aligned data such that similar data points are

clustered together. This results in enhanced classification. To

determine the parameters in the NET model (and in other unsupervised

domain adaptation models), we introduce a validation procedure by

sampling source data points that are similar in distribution to the

target data. We test the NET and the validation procedure using

popular image datasets and compare the classification results across

competitive procedures for unsupervised domain adaptation.

The task of AMR-to-text generation is to generate grammatical text

that sustains the semantic meaning for a given AMR graph. We attack

the task by first partitioning the AMR graph into smaller fragments,

and then generating the translation for each fragment, before finally

deciding the order by solving an asymmetric generalized traveling

salesman problem (AGTSP). A Maximum Entropy classifier is trained to

estimate the traveling costs, and a TSP solver is used to find the

optimized solution. The final model reports a BLEU score of 22.44 on

the SemEval-2016 Task8 dataset.
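The final ordering step can be illustrated with a brute-force stand-in for the AGTSP solver; the fragments and the asymmetric transition costs below are invented toy values, and a real system would use Maximum-Entropy-estimated costs and a proper TSP solver:

```python
import itertools

def order_fragments(fragments, cost):
    """Choose the fragment order that minimizes total transition cost
    (brute-force search over permutations; fine for a handful of fragments)."""
    best, best_cost = None, float("inf")
    for perm in itertools.permutations(range(len(fragments))):
        c = sum(cost[perm[i]][perm[i + 1]] for i in range(len(perm) - 1))
        if c < best_cost:
            best, best_cost = perm, c
    return [fragments[i] for i in best]

fragments = ["the boy", "wants", "to go"]  # toy generated fragments
cost = [[0, 1, 9],                         # asymmetric transition costs
        [9, 0, 1],
        [3, 9, 0]]
print(order_fragments(fragments, cost))    # → ['the boy', 'wants', 'to go']
```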

Scientists often run experiments to distinguish competing theories.

This requires patience, rigor, and ingenuity – there is often a large

space of possible experiments one could run. But we need not comb this

space by hand – if we represent our theories as formal models and

explicitly declare the space of experiments, we can automate the

search for good experiments, looking for those with high expected

information gain. Here, we present a general and principled approach

to experiment design based on probabilistic programming languages

(PPLs). PPLs offer a clean separation between declaring problems and

solving them, which means that the scientist can automate experiment

design by simply declaring her model and experiment spaces in the PPL

without having to worry about the details of calculating information

gain. We demonstrate our system in two case studies drawn from

cognitive psychology, where we use it to design optimal experiments in

the domains of sequence prediction and categorization. We find strong

empirical validation that our automatically designed experiments were

indeed optimal. We conclude by discussing a number of interesting

questions for future research.
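The quantity being maximized, expected information gain, can be computed directly for a discrete toy problem. The two candidate "experiments" and their outcome likelihoods below are invented for illustration:

```python
import math

def entropy(p):
    """Shannon entropy in bits."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def expected_information_gain(prior, likelihoods):
    """EIG of an experiment: prior entropy minus expected posterior entropy.
    likelihoods[m][o] = P(outcome o | model m)."""
    n_models, n_outcomes = len(prior), len(likelihoods[0])
    eig = entropy(prior)
    for o in range(n_outcomes):
        p_o = sum(prior[m] * likelihoods[m][o] for m in range(n_models))
        if p_o > 0:
            posterior = [prior[m] * likelihoods[m][o] / p_o
                         for m in range(n_models)]
            eig -= p_o * entropy(posterior)
    return eig

prior = [0.5, 0.5]                # two competing theories, equally likely
exp_a = [[0.9, 0.1], [0.1, 0.9]]  # a discriminating experiment
exp_b = [[0.5, 0.5], [0.5, 0.5]]  # an uninformative experiment
print(expected_information_gain(prior, exp_a))  # ≈ 0.531 bits
print(expected_information_gain(prior, exp_b))  # 0.0 bits
```

An automated designer simply searches the declared experiment space for the option with the highest EIG.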

And then what?

Munger’s system is akin to “cross-training for the mind.” Instead of

siloing ourselves in the small, limited areas we may have studied in

school, we study a broadly useful set of knowledge about the world,

which will serve us in all parts of life.

• [cs.AI]**Toward Goal-Driven Neural Network Models for the Rodent
Whisker-Trigeminal System**

*Chengxu Zhuang, Jonas Kubilius, Mitra Hartmann, Daniel Yamins*

http://arxiv.org/abs/1706.07555v1

• [cs.CL]**Deep Multi-Task Learning with Shared Memory**

*Pengfei Liu, Xipeng Qiu, Xuanjing Huang*

http://arxiv.org/abs/1609.07222v1

• [cs.CL]**An Efficient Character-Level Neural Machine Translation**

*Shenjian Zhao, Zhihua Zhang*

http://arxiv.org/abs/1608.04738v1

And then the world gains one more Chatur (the careerist from *3 Idiots*).

In a famous speech in the 1990s, Munger explained his novel approach to

gaining practical wisdom:

In large part, rodents see the world through their whiskers, a

powerful tactile sense enabled by a series of brain areas that form

the whisker-trigeminal system. Raw sensory data arrives in the form of

mechanical input to the exquisitely sensitive, actively-controllable

whisker array, and is processed through a sequence of neural circuits,

eventually arriving in cortical regions that communicate with

decision-making and memory areas. Although a long history of

experimental studies has characterized many aspects of these

processing stages, the computational operations of the

whisker-trigeminal system remain largely unknown. In the present work,

we take a goal-driven deep neural network (DNN) approach to modeling

these computations. First, we construct a biophysically-realistic

model of the rat whisker array. We then generate a large dataset of

whisker sweeps across a wide variety of 3D objects in highly-varying

poses, angles, and speeds. Next, we train DNNs from several distinct

architectural families to solve a shape recognition task in this

dataset. Each architectural family represents a structurally-distinct

hypothesis for processing in the whisker-trigeminal system,

corresponding to different ways in which spatial and temporal

information can be integrated. We find that most networks perform

poorly on the challenging shape recognition task, but that specific

architectures from several families can achieve reasonable performance

levels. Finally, we show that Representational Dissimilarity Matrices

(RDMs), a tool for comparing population codes between neural systems,

can separate these higher-performing networks with data of a type that

could plausibly be collected in a neurophysiological or imaging

experiment. Our results are a proof-of-concept that goal-driven DNN

networks of the whisker-trigeminal system are potentially within

reach.
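The RDM comparison mentioned at the end is easy to sketch: an entry (i, j) is one minus the correlation between the population responses to stimuli i and j. The toy responses below are invented:

```python
import math

def pearson(a, b):
    """Pearson correlation between two equal-length response vectors."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)

def rdm(responses):
    """Representational dissimilarity matrix over stimuli:
    entry (i, j) = 1 - correlation of the population responses."""
    n = len(responses)
    return [[1 - pearson(responses[i], responses[j]) for j in range(n)]
            for i in range(n)]

# Toy population responses (3 stimuli x 4 units); values are invented.
responses = [[1.0, 2.0, 3.0, 4.0],
             [1.1, 2.1, 2.9, 4.2],
             [4.0, 3.0, 2.0, 1.0]]
m = rdm(responses)
print(round(m[0][2], 3))  # anti-correlated responses → dissimilarity near 2
```

Because an RDM depends only on pairwise dissimilarities, the same matrix can be estimated from neural recordings and compared against each candidate network.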

Neural network based models have achieved impressive results on

various specific tasks. However, in previous works, most models are

learned separately based on single-task supervised objectives, which

often suffer from insufficient training data. In this paper, we

propose two deep architectures which can be trained jointly on

multiple related tasks. More specifically, we augment neural model

with an external memory, which is shared by several tasks. Experiments

on two groups of text classification tasks show that our proposed

architectures can improve the performance of a task with the help of

other related tasks.

Neural machine translation aims at building a single large neural

network that can be trained to maximize translation performance. The

encoder-decoder architecture with an attention mechanism achieves a

translation performance comparable to the existing state-of-the-art

phrase-based systems on the task of English-to-French translation.

However, the use of large vocabulary becomes the bottleneck in both

training and improving the performance. In this paper, we propose an

efficient architecture to train a deep character-level neural machine

translation by introducing a decimator and an interpolator. The

decimator is used to sample the source sequence before encoding while

the interpolator is used to resample after decoding. Such a deep model

has two major advantages. It avoids the large vocabulary issue

radically; at the same time, it is much faster and more

memory-efficient in training than conventional character-based models.

More interestingly, our model is able to translate misspelled words, much as human beings do.

Living in a mansion, driving a Lamborghini, nether regions zapped, letting out silent farts.

Well, the first rule is that you can’t really know anything if you just

remember isolated facts and try and bang ’em back. If the facts don’t

hang together on a latticework of theory, you don’t have them in a

usable form.

• [cs.CL]**Comparison of Modified Kneser-Ney and Witten-Bell Smoothing
Techniques in Statistical Language Model of Bahasa Indonesia**

*Ismail Rusli*

http://arxiv.org/abs/1706.07786v1

• [cs.CL]**Incorporating Relation Paths in Neural Relation
Extraction**

*Wenyuan Zeng, Yankai Lin, Zhiyuan Liu, Maosong Sun*

http://arxiv.org/abs/1609.07479v1

• [cs.CL]**Cohesion and Coalition Formation in the European
Parliament: Roll-Call Votes and Twitter Activities**

*Darko Cherepnalkoski, Andreas Karpf, Igor Mozetic, Miha Grcar*

http://arxiv.org/abs/1608.04917v1

An entire life like one long silent fart.

You’ve got to have models in your head. And you’ve got to array your

experience both vicarious and direct on this latticework of models. You

may have noticed students who just try to remember and pound back what

is remembered. Well, they fail in school and in life. You’ve got to hang

experience on a latticework of models in your head.

Smoothing is one technique to overcome data sparsity in statistical

language model. Although in its mathematical definition there is no

explicit dependency upon specific natural language, different natures

of natural languages result in different effects of smoothing

techniques. This is true for the Russian language, as shown by Whittaker (1998). In this paper, we compared Modified Kneser-Ney and Witten-Bell

smoothing techniques in statistical language model of Bahasa

Indonesia. We used train sets of totally 22M words that we extracted

from Indonesian version of Wikipedia. As far as we know, this is the

largest train set used to build statistical language model for Bahasa

Indonesia. The experiments with 3-gram, 5-gram, and 7-gram showed that

the Modified Kneser-Ney smoothing technique consistently outperforms Witten-Bell in terms of perplexity. It is interesting to note that in our experiments the 5-gram Modified Kneser-Ney model outperformed the 7-gram one, whereas Witten-Bell smoothing improves consistently as the n-gram order increases.
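To make the smoothing idea concrete, here is a toy bigram version of Witten-Bell smoothing with backoff to a unigram MLE; the corpus is a made-up sentence, whereas the paper evaluates 3-, 5-, and 7-gram models on 22M words of Indonesian Wikipedia:

```python
from collections import Counter, defaultdict

def witten_bell_bigram(tokens):
    """Bigram LM with Witten-Bell smoothing, backing off to unigram MLE."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    followers = defaultdict(set)          # distinct continuations of each h
    for h, w in bigrams:
        followers[h].add(w)
    total = sum(unigrams.values())

    def p(w, h):
        t, c_h = len(followers[h]), unigrams[h]
        p_uni = unigrams[w] / total
        if c_h == 0 or t == 0:
            return p_uni                  # unseen history: back off entirely
        lam = c_h / (c_h + t)             # Witten-Bell mixing weight
        return lam * bigrams[(h, w)] / c_h + (1 - lam) * p_uni

    return p

corpus = "the cat sat on the mat the dog sat on the log".split()
p = witten_bell_bigram(corpus)
vocab = set(corpus)
# The smoothed distribution still normalizes over the vocabulary,
# and unseen bigrams receive nonzero probability mass.
print(sum(p(w, "the") for w in vocab))  # ≈ 1.0
print(p("log", "cat") > 0)
```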

Distantly supervised relation extraction has been widely used to find

novel relational facts from plain text. To predict the relation

between a pair of target entities, existing methods rely solely on

those direct sentences containing both entities. In fact, there are

also many sentences containing only one of the target entities, which

provide rich and useful information for relation extraction. To

address this issue, we build inference chains between two target

entities via intermediate entities, and propose a path-based neural

relation extraction model to encode the relational semantics from both

direct sentences and inference chains. Experimental results on

real-world datasets show that our model can make full use of those

sentences containing only one target entity, and achieves significant

and consistent improvements on relation extraction as compared with

baselines.
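The idea of building inference chains through intermediate entities can be sketched as a two-hop join over a toy knowledge graph; the entities and relation names below are invented for illustration:

```python
from collections import defaultdict

def two_hop_chains(triples, head, tail):
    """Collect relation paths head -r1-> mid -r2-> tail: the kind of
    inference chain the model encodes alongside direct sentences."""
    out = defaultdict(list)
    for h, r, t in triples:
        out[h].append((r, t))
    return [(r1, mid, r2)
            for r1, mid in out[head]
            for r2, t in out[mid] if t == tail]

triples = [("Melbourne", "city_of", "Victoria"),
           ("Victoria", "state_of", "Australia"),
           ("Melbourne", "located_in", "Australia")]
print(two_hop_chains(triples, "Melbourne", "Australia"))
# → [('city_of', 'Victoria', 'state_of')]
```

Sentences mentioning only one target entity can still support such a chain, which is exactly the extra signal the model exploits.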

We study the cohesion within and the coalitions between political

groups in the Eighth European Parliament (2014–2019) by analyzing two

entirely different aspects of the behavior of the Members of the

European Parliament (MEPs) in the policy-making processes. On one

hand, we analyze their co-voting patterns and, on the other, their

retweeting behavior. We make use of two diverse datasets in the

analysis. The first one is the roll-call vote dataset, where cohesion

is regarded as the tendency to co-vote within a group, and a coalition

is formed when the members of several groups exhibit a high degree of

co-voting agreement on a subject. The second dataset comes from

Twitter; it captures the retweeting (i.e., endorsing) behavior of the

MEPs and implies cohesion (retweets within the same group) and

coalitions (retweets between groups) from a completely different

perspective. We employ two different methodologies to analyze the

cohesion and coalitions. The first one is based on Krippendorff’s

Alpha reliability, used to measure the agreement between raters in

data-analysis scenarios, and the second one is based on Exponential

Random Graph Models, often used in social-network analysis. We give

general insights into the cohesion of political groups in the European

Parliament, explore whether coalitions are formed in the same way for

different policy areas, and examine to what degree the retweeting

behavior of MEPs corresponds to their co-voting patterns. A novel and

interesting aspect of our work is the relationship between the

co-voting and retweeting patterns.
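Co-voting cohesion can be illustrated with a bare pairwise-agreement score over a single roll call; the paper itself uses Krippendorff's Alpha and Exponential Random Graph Models rather than this toy measure:

```python
from itertools import combinations

def cohesion(votes):
    """Fraction of member pairs in a group casting the same vote on one
    roll call (a simple agreement score; 1.0 means perfect cohesion)."""
    pairs = list(combinations(votes, 2))
    return sum(a == b for a, b in pairs) / len(pairs)

print(cohesion(["yes", "yes", "yes"]))  # 1.0: fully cohesive group
print(cohesion(["yes", "yes", "no"]))   # one dissenter lowers agreement
```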

Dissolving into the city, where nobody notices you at all.

What are the models? Well, the first rule is that you’ve got to have

multiple models because if you just have one or two that you’re using,

the nature of human psychology is such that you’ll torture reality so

that it fits your models, or at least you’ll think it does. …

• [cs.CL]**End-to-end Conversation Modeling Track in DSTC6**

*Chiori Hori, Takaaki Hori*

http://arxiv.org/abs/1706.07440v1

• [cs.CL]**Language as a Latent Variable: Discrete Generative Models
for Sentence Compression**

*Yishu Miao, Phil Blunsom*

http://arxiv.org/abs/1609.07317v1

• [cs.CL]**Ensemble of Jointly Trained Deep Neural Network-Based
Acoustic Models for Reverberant Speech Recognition**

*Jeehye Lee, Myungin Lee, Joon-Hyuk Chang*

http://arxiv.org/abs/1608.04983v1

I don't know how to define success.

And the models have to come from multiple disciplines because all the

wisdom of the world is not to be found in one little academic

department. That’s why poetry professors, by and large, are so unwise in

a worldly sense. They don’t have enough models in their heads. So you’ve

got to have models across a fair array of disciplines.

End-to-end training of neural networks is a promising approach to

automatic construction of dialog systems using a human-to-human dialog

corpus. Recently, Vinyals et al. tested neural conversation models

using OpenSubtitles. Lowe et al. released the Ubuntu Dialogue Corpus

for researching unstructured multi-turn dialogue systems. Furthermore,

the approach has been extended to accomplish task oriented dialogs to

provide information properly with natural conversation. For example,

Ghazvininejad et al. proposed a knowledge grounded neural conversation

model [3], which aims to combine conversational

dialogs with task-oriented knowledge using unstructured data such as

Twitter data for conversation and Foursquare data for external

knowledge. However, the task is still limited to a restaurant

information service, and has not yet been tested with a wide variety

of dialog tasks. In addition, it is still unclear how to create

intelligent dialog systems that can respond like a human agent. In

consideration of these problems, we proposed a challenge track to the

6th dialog system technology challenges (DSTC6) using human-to-human

dialog data to mimic human dialog behaviors. The focus of the

challenge track is to train end-to-end conversation models from

human-to-human conversation and accomplish end-to-end dialog tasks in

various situations in a customer-service setting, in which the system

plays the role of a human agent and generates natural and informative

sentences in response to the user’s questions or comments given the

dialog context.

In this work we explore deep generative models of text in which the

latent representation of a document is itself drawn from a discrete

language model distribution. We formulate a variational auto-encoder

for inference in this model and apply it to the task of compressing

sentences. In this application the generative model first draws a

latent summary sentence from a background language model, and then

subsequently draws the observed sentence conditioned on this latent

summary. In our empirical evaluation we show that generative

formulations of both abstractive and extractive compression yield

state-of-the-art results when trained on a large amount of supervised

data. Further, we explore semi-supervised compression scenarios where

we show that it is possible to achieve performance competitive with

previously proposed supervised models while training on a fraction of

the supervised data.

Distant speech recognition is a challenge, particularly due to the

corruption of speech signals by reverberation caused by large

distances between the speaker and microphone. In order to cope with a

wide range of reverberations in real-world situations, we present

novel approaches for acoustic modeling including an ensemble of deep

neural networks (DNNs) and an ensemble of jointly trained DNNs. First,

multiple DNNs are established, each of which corresponds to a

different reverberation time (RT60) in a setup step. Also, each

model in the ensemble of DNN acoustic models is further jointly

trained, including both feature mapping and acoustic modeling, where

the feature mapping is designed for the dereverberation as a

front-end. In a testing phase, the two most likely DNNs are chosen

from the DNN ensemble using maximum a posteriori (MAP) probabilities,

computed in an online fashion by using maximum likelihood (ML)-based

blind RT60 estimation and then the posterior probability outputs from

two DNNs are combined using the ML-based weights as a simple average.

Extensive experiments demonstrate that the proposed approach leads to

substantial improvements in speech recognition accuracy over the

conventional DNN baseline systems under diverse reverberant

conditions.
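
The final combination step above can be sketched as follows. The function name and numbers are hypothetical; the sketch assumes each selected DNN outputs a posterior vector (e.g. over senones) and that blind RT60 estimation supplies an ML likelihood per model:

```python
import numpy as np

def combine_two_dnns(post_a, post_b, lik_a, lik_b):
    """Weighted average of the posteriors of the two most likely DNNs,
    using normalized ML-based RT60 likelihoods as the weights."""
    w = lik_a / (lik_a + lik_b)            # ML-based weight for model A
    mix = w * post_a + (1.0 - w) * post_b  # simple weighted average
    return mix / mix.sum()                 # renormalize to a distribution

p = combine_two_dnns(np.array([0.7, 0.2, 0.1]),
                     np.array([0.5, 0.3, 0.2]),
                     lik_a=0.8, lik_b=0.2)
```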

Indians become engineers, Americans become doctors and lawyers, and the Chinese... (harmonized, harmonized)

You may say, “My God, this is already getting way too tough.” But,

fortunately, it isn’t that tough because 80 or 90 important models will

carry about 90% of the freight in making you a worldly wise person. And,

of those, only a mere handful really carry very heavy freight.(1)

• [cs.CL]**Named Entity Recognition with stack residual LSTM and
trainable bias decoding**

*Quan Tran, Andrew MacKinlay, Antonio Jimeno Yepes*

http://arxiv.org/abs/1706.07598v1

• [cs.CR]**Building accurate HAV exploiting User Profiling and
Sentiment Analysis**

*Alan Ferrari, Angelo Consoli*

http://arxiv.org/abs/1609.07302v1

• [cs.CL]**Proceedings of the LexSem+Logics Workshop 2016**

*Steven Neale, Valeria de Paiva, Arantxa Otegi, Alexandre Rademaker*

http://arxiv.org/abs/1608.04767v1

But in the effort to become some kind of professional, one forgets how to be some kind of person.

Taking Munger’s concept as our starting point, we can figure out how to

use our brains more effectively by building our own latticework of

mental models.

Recurrent Neural Net models are the state-of-the-art for Named Entity

Recognition (NER). We present two innovations to improve the

performance of these models. The first innovation is the introduction

of residual connections between the Stacked Recurrent Neural Network

model to address the degradation problem of deep neural networks. The

second innovation is a bias decoding mechanism that allows the trained

system to adapt to a non-differentiable and externally computed

objective, such as the F-measure, to address the limitations of

traditional loss functions that optimize for accuracy. Our work

improves the state-of-the-art results for both Spanish and English

languages on the standard train/development/test split of the CoNLL

2003 Shared Task NER dataset.
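
The first innovation, residual connections between stacked recurrent layers, can be sketched as a single plain-NumPy time step. The vanilla-RNN cell and shapes below are illustrative stand-ins for the paper's stacked LSTM:

```python
import numpy as np

def stacked_rnn_step(x, layers):
    """One time step of a stacked RNN where each layer's output is
    added to its own input (a residual skip between stacked layers),
    easing gradient flow and the degradation problem in deep stacks."""
    h = x
    for W_in, W_rec, state in layers:
        out = np.tanh(W_in @ h + W_rec @ state)  # vanilla RNN cell
        state[:] = out                           # update the layer's state
        h = out + h                              # residual connection
    return h

d = 4
layers = [(0.1 * np.eye(d), 0.1 * np.eye(d), np.zeros(d)) for _ in range(3)]
h = stacked_rnn_step(np.ones(d), layers)
```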

Social Engineering (SE) is one of the most dangerous techniques an

attacker can use against a given entity (private citizen, industry,

government, …). In order to perform SE attacks, it is necessary to

collect as much information as possible about the target (or

victim(s)). The aim of this paper is to report the details of an

activity which led to the development of an automatic tool that

extracts, categorizes and summarizes the target’s interests, and thus

possible weaknesses with respect to specific topics. Data is collected

from the user’s activity on social networks, parsed and analyzed using

text mining techniques. The main contribution of the proposed tool is

a set of reports that allow citizens, institutions and private bodies

to screen their exposure to SE attacks, raising awareness in a way

that reduces risk and offers a good opportunity to save money.

Lexical semantics continues to play an important role in driving

research directions in NLP, with the recognition and understanding of

context becoming increasingly important in delivering successful

outcomes in NLP tasks. Besides traditional processing areas such as

word sense and named entity disambiguation, the creation and

maintenance of dictionaries, annotated corpora and resources have

become cornerstones of lexical semantics research and produced a

wealth of contextual information that NLP processes can exploit. New

efforts both to link and construct from scratch such information – as

Linked Open Data or by way of formal tools coming from logic,

ontologies and automated reasoning – have increased the

interoperability and accessibility of resources for lexical and

computational semantics, even in those languages for which they have

previously been limited. LexSem+Logics 2016 combines the 1st Workshop

on Lexical Semantics for Lesser-Resourced Languages and the 3rd

Workshop on Logics and Ontologies. The accepted papers in our program

covered topics across these two areas, including: the encoding of

plurals in Wordnets, the creation of a thesaurus from multiple sources

based on semantic similarity metrics, and the use of cross-lingual

treebanks and annotations for universal part-of-speech tagging. We

also welcomed talks from two distinguished speakers: on Portuguese

lexical knowledge bases (different approaches, results and their

application in NLP tasks) and on new strategies for open information

extraction (the capture of verb-based propositions from massive text

corpora).

Universities are merely arsenals, delivering ever more prisoners of war into the cities to join the gun smoke.

Building the Latticework

• [cs.CL]**Neural Machine Translation with Gumbel-Greedy Decoding**

*Jiatao Gu, Daniel Jiwoong Im, Victor O. K. Li*

http://arxiv.org/abs/1706.07518v1

• [cs.CV]**EFANNA : An Extremely Fast Approximate Nearest Neighbor
Search Algorithm Based on kNN Graph**

*Cong Fu, Deng Cai*

http://arxiv.org/abs/1609.07228v1

• [cs.CL]**The Roles of Path-based and Distributional Information in
Recognizing Lexical Semantic Relations**

*Vered Shwartz, Ido Dagan*

http://arxiv.org/abs/1608.05014v1

compete,

or die.

The central principle of the mental-models approach is that you must

have a large number of them, and they must be fundamentally lasting

ideas.

Previous neural machine translation models used some heuristic search

algorithms (e.g., beam search) in order to avoid solving the maximum a

posteriori problem over translation sentences at test time. In this

paper, we propose the Gumbel-Greedy Decoding which trains a generative

network to predict translation under a trained model. We solve such a

problem using the Gumbel-Softmax reparameterization, which makes our

generative network differentiable and trainable through standard

stochastic gradient methods. We empirically demonstrate that our

proposed model is effective for generating sequences of discrete

words.
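
The core reparameterization can be sketched in a few lines of NumPy. This shows only the Gumbel-Softmax sampling step, not the paper's full generative network or training setup:

```python
import numpy as np

def gumbel_softmax_sample(logits, temperature=1.0, rng=None):
    """Differentiable, approximately one-hot sample from a categorical
    distribution over words via the Gumbel-Softmax trick."""
    rng = rng if rng is not None else np.random.default_rng(0)
    u = rng.uniform(low=1e-12, high=1.0, size=logits.shape)
    gumbel = -np.log(-np.log(u))           # Gumbel(0, 1) noise
    y = (logits + gumbel) / temperature    # perturb logits, then anneal
    e = np.exp(y - y.max())                # numerically stable softmax
    return e / e.sum()

# Lower temperatures push the sample closer to a hard one-hot argmax
probs = gumbel_softmax_sample(np.array([2.0, 0.5, -1.0]), temperature=0.5)
```

Because the sample is a smooth function of the logits, gradients flow through it, which is what makes the generative network trainable by standard SGD.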

Approximate nearest neighbor (ANN) search is a fundamental problem in

many areas of data mining, machine learning and computer vision. The

performance of traditional hierarchical structure (tree) based methods

decreases as the dimensionality of data grows, while hashing based

methods usually lack efficiency in practice. Recently, the graph based

methods have drawn considerable attention. The main idea is that

\emph{a neighbor of a neighbor is also likely to be a neighbor},

which we refer to as \emph{NN-expansion}. These methods construct a

$k$-nearest neighbor ($k$NN) graph offline. At the online search

stage, these methods find candidate neighbors of a query point in some

way (e.g., random selection), and then check the neighbors of these

candidate neighbors for closer ones iteratively. Despite some

promising results, there are mainly two problems with these

approaches: 1) These approaches tend to converge to local optima. 2)

Constructing a $k$NN graph is time consuming. We find that these two

problems can be nicely solved when we provide a good initialization

for NN-expansion. In this paper, we propose EFANNA, an extremely fast

approximate nearest neighbor search algorithm based on $k$NN Graph.

EFANNA nicely combines the advantages of hierarchical structure based

methods and nearest-neighbor-graph based methods. Extensive

experiments have shown that EFANNA outperforms the state-of-the-art

algorithms both on approximate nearest neighbor search and approximate

nearest neighbor graph construction. To the best of our knowledge,

EFANNA is the fastest algorithm so far both on approximate nearest

neighbor graph construction and approximate nearest neighbor search. A

library based on this research, EFANNA, has been released on GitHub.
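
The NN-expansion step, and why a poor starting point gets stuck in a local optimum, can be sketched greedily on a toy kNN graph. The function and data below are illustrative, not EFANNA's actual implementation:

```python
import numpy as np

def nn_expansion(query, data, graph, seed, iters=50):
    """Greedy NN-expansion on a prebuilt kNN graph: starting from a
    seed vertex, keep moving to the closest neighbor of the current
    best until no neighbor improves it (a local optimum)."""
    dist = lambda i: float(np.linalg.norm(data[i] - query))
    best = seed
    for _ in range(iters):
        candidate = min(graph[best], key=dist, default=best)
        if dist(candidate) < dist(best):
            best = candidate  # a neighbor of a neighbor got closer
        else:
            break             # stuck: result quality depends on the seed
    return best

# 1-D points with a chain-shaped kNN graph
data = np.array([[0.0], [1.0], [2.0], [3.0]])
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(nn_expansion(np.array([2.9]), data, graph, seed=0))  # 3
```

EFANNA's contribution is to supply good seeds cheaply via a hierarchical structure, so this greedy walk starts near the answer instead of from a random vertex.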

Recognizing various semantic relations between terms is crucial for

many NLP tasks. While path-based and distributional information

sources are considered complementary, the strong results the latter

showed on recent datasets suggested that the former’s contribution

might have become obsolete. We follow the recent success of an

integrated neural method for hypernymy detection (Shwartz et al.,

2016) and extend it to recognize multiple relations. We demonstrate

that these two information sources are indeed complementary, and

analyze the contributions of each source.

Suppose life is a binary variable with only two outcomes: success = 1

and failure = 0. Then the probability distribution of success vs.

failure is a logistic regression.
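
A minimal sketch of the logistic model mentioned here; the score weights are invented purely for illustration:

```python
import math

def p_success(z):
    """Logistic function: maps a real-valued score z to P(success = 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical linear score: z = w * effort + b (weights made up here)
w, b = 1.5, -2.0
print(round(p_success(w * 2.0 + b), 4))  # 0.7311 for a score of 1.0
```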

As with physical tools, the lack of a mental tool at a crucial moment

can lead to a bad result, and the use of a wrong mental tool is even

worse.

• [cs.CL]**Personalization in Goal-Oriented Dialog**

*Chaitanya K. Joshi, Fei Mi, Boi Faltings*

http://arxiv.org/abs/1706.07503v1

• [cs.CV]**EgoCap: Egocentric Marker-less Motion Capture with Two
Fisheye Cameras**

*Helge Rhodin, Christian Richardt, Dan Casas, Eldar Insafutdinov, Mohammad Shafiei, Hans-Peter Seidel, Bernt Schiele, Christian Theobalt*

http://arxiv.org/abs/1609.07306v1

• [cs.CV]**An image compression and encryption scheme based on deep
learning**

*Fei Hu, Changjiu Pu, Haowei Gao, Mengzi Tang, Li Li*

http://arxiv.org/abs/1608.05001v1

Portal to logistic regression:

If this seems self-evident, it’s actually a very unnatural way to think.

Without the right training, most minds take the wrong approach. They

prefer to solve problems by asking: Which ideas do I already love and

know deeply, and how can I apply them to the situation at hand?

Psychologists call this tendency the “Availability Heuristic” and its

power is well documented.

The main goal of modelling human conversation is to create agents

which can interact with people in both open-ended and goal-oriented

scenarios. End-to-end trained neural dialog systems are an important

line of research for such generalized dialog models as they do not

resort to any situation-specific handcrafting of rules. Modelling

personalization of conversation in such agents is important for them

to be truly ‘smart’ and to integrate seamlessly into the lives of

human beings. However, the topic has been largely unexplored by

researchers as there are no existing corpora for training dialog

systems on conversations that are influenced by the profiles of the

speakers involved. In this paper, we present a new dataset of

goal-oriented dialogs with profiles attached to them. We also

introduce a framework for analyzing how systems model personalization

in addition to performing the task associated with each dialog.

Although no existing model was able to sufficiently solve our tasks,

we provide baselines using a variety of learning methods and

investigate in detail the shortcomings of an end-to-end dialog system

based on Memory Networks. Our dataset and experimental code are

publicly available at

https://github.com/chaitjo/personalized-dialog

Marker-based and marker-less optical skeletal motion-capture methods

use an outside-in arrangement of cameras placed around a scene, with

viewpoints converging on the center. They often cause discomfort

through the marker suits they may require, and their recording volume

is severely restricted, often constrained to indoor scenes with controlled

backgrounds. Alternative suit-based systems use several inertial

measurement units or an exoskeleton to capture motion. This makes

capturing independent of a confined volume, but requires substantial,

often constraining, and hard to set up body instrumentation. We

therefore propose a new method for real-time, marker-less and

egocentric motion capture which estimates the full-body skeleton pose

from a lightweight stereo pair of fisheye cameras that are attached to

a helmet or virtual reality headset. It combines the strength of a new

generative pose estimation framework for fisheye views with a

ConvNet-based body-part detector trained on a large new dataset. Our

inside-in method captures full-body motion in general indoor and

outdoor scenes, and also crowded scenes with many people in close

vicinity. The captured user can freely move around, which enables

reconstruction of larger-scale activities and is particularly useful

in virtual reality to freely roam and interact, while seeing the fully

motion-captured virtual body.

The Stacked Auto-Encoder (SAE) is a deep learning algorithm for

unsupervised learning. It has multiple layers that project the vector

representation of the input data into a lower-dimensional vector

space. These projection vectors are dense representations of the

input data, so an SAE can be used for image compression. Using a

chaotic logistic map, the compressed codes can further be encrypted.

In this study, an

application of image compression and encryption is suggested using SAE

and a chaotic logistic map. Experiments show that this application is

feasible and effective. It can be used for image transmission and

image protection on the internet simultaneously.
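
The encryption half can be sketched as a logistic-map keystream XORed with the compressed bytes; the pair (x0, r) plays the role of the secret key. The parameter values below are illustrative, not from the paper:

```python
def logistic_map_keystream(x0, r, n):
    """Byte keystream from the chaotic logistic map
    x_{t+1} = r * x_t * (1 - x_t); tiny changes to the key (x0, r)
    yield a completely different stream."""
    x, out = x0, []
    for _ in range(n):
        x = r * x * (1 - x)
        out.append(int(x * 256) % 256)
    return bytes(out)

def xor_crypt(data, x0=0.61, r=3.99):
    """XOR with the keystream; applying it twice decrypts."""
    ks = logistic_map_keystream(x0, r, len(data))
    return bytes(b ^ k for b, k in zip(data, ks))

codes = b"compressed codes"
assert xor_crypt(xor_crypt(codes)) == codes  # XOR round trip decrypts
```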

Although 1 and 0 are discrete, everyone keeps pushing endlessly toward 1; some stop partway up the slope, and some tumble back down it.

You know the adage “To the man with only a hammer, everything starts

looking like a nail.” Such narrow-minded thinking feels entirely natural

to us, but it leads to far too many misjudgments. You probably do it

every single day without knowing it.

• [cs.CR]**Integrating self-efficacy into a gamified approach to
thwart phishing attacks**

*Nalin Asanka Gamagedara Arachchilage, Mumtaz Abdul Hameed*

http://arxiv.org/abs/1706.07748v1

• [cs.CV]**Example-Based Image Synthesis via Randomized
Patch-Matching**

*Yi Ren, Yaniv Romano, Michael Elad*

http://arxiv.org/abs/1609.07370v1

• [cs.CV]**Frame- and Segment-Level Features and Candidate Pool
Evaluation for Video Caption Generation**

*Rakshith Shetty, Jorma Laaksonen*

http://arxiv.org/abs/1608.04959v1

If Success:

It’s not that you don’t have some good ideas in your head. You probably

do! No competent adult is a total klutz. It’s just that we tend to be

very limited in our good ideas, and we overuse them. This combination

makes our good ideas just as dangerous as bad ones!

Security exploits can include cyber threats such as computer programs

that can disturb the normal behavior of computer systems (viruses),

unsolicited e-mail (spam), malicious software (malware), monitoring

software (spyware), attempting to make computer resources unavailable

to their intended users (Distributed Denial-of-Service or DDoS

attack), the social engineering, and online identity theft (phishing).

One such cyber threat, which is particularly dangerous to computer

users is phishing. Phishing is well known as online identity theft,

which targets to steal victims’ sensitive information such as

username, password and online banking details. This paper focuses on

designing an innovative and gamified approach to educate individuals

about phishing attacks. The study asks how one can integrate

self-efficacy, which has a co-relation with the user’s knowledge, into

an anti-phishing educational game to thwart phishing attacks? One of

the main reasons would appear to be a lack of user knowledge to

prevent from phishing attacks. Therefore, this research investigates

the elements that influence (in this case, either conceptual or

procedural knowledge or their interaction effect) and then integrate

them into an anti-phishing educational game to enhance people’s

phishing prevention behaviour through their motivation.

Image and texture synthesis is a challenging task that has long been

drawing attention in the fields of image processing, graphics, and

machine learning. This problem consists of modelling the desired type

of images, either through training examples or via a parametric

modeling, and then generating images that belong to the same

statistical origin. This work addresses the image synthesis task,

focusing on two specific families of images — handwritten digits and

face images. This paper offers two main contributions. First, we

suggest a simple and intuitive algorithm capable of generating such

images in a unified way. The proposed approach taken is pyramidal,

consisting of upscaling and refining the estimated image several

times. For each upscaling stage, the algorithm randomly draws small

patches from a patch database, and merges these to form a coherent and

novel image with high visual quality. The second contribution is a

general framework for the evaluation of the generation performance,

which combines three aspects: the likelihood, the originality and the

spread of the synthesized images. We assess the proposed synthesis

scheme and show that the results are similar in nature, and yet

different from the ones found in the training set, suggesting that a

true synthesis effect has been obtained.

We present our submission to the Microsoft Video to Language Challenge

of generating short captions describing videos in the challenge

dataset. Our model is based on the encoder–decoder pipeline, popular

in image and video captioning systems. We propose to utilize two

different kinds of video features, one to capture the video content in

terms of objects and attributes, and the other to capture the motion

and action information. Using these diverse features we train models

specializing in two separate input sub-domains. We then train an

evaluator model which is used to pick the best caption from the pool

of candidates generated by these domain expert models. We argue that

this approach is better suited for the current video captioning task,

compared to using a single model, due to the diversity in the dataset.

The efficacy of our method is proven by the fact that it was rated

best in the MSR Video to Language Challenge, as per human evaluation.

Additionally, we were ranked second in the table based on automatic

evaluation metrics.

Those who reach the top by trampling over the shoulders of most others have the right to unleash their own desires.

The great investor and teacher Benjamin Graham explained it best:

• [cs.CV]**Computer-aided implant design for the restoration of
cranial defects**

*Xiaojun Chen, Lu Xu, Xing Li, Jan Egger*

http://arxiv.org/abs/1706.07649v1

• [cs.CV]**Funnel-Structured Cascade for Multi-View Face Detection
with Alignment-Awareness**

*Shuzhe Wu, Meina Kan, Zhenliang He, Shiguang Shan, Xilin Chen*

http://arxiv.org/abs/1609.07304v1

• [cs.CV]**Geometry-aware Similarity Learning on SPD Manifolds for
Visual Recognition**

*Zhiwu Huang, Ruiping Wang, Xianqiu Li, Wenxian Liu, Shiguang Shan, Luc Van Gool, Xilin Chen*

http://arxiv.org/abs/1608.04914v1

Believing that after wading through ten thousand hardships, what finally awaits is the light.

You can get in way more trouble with a good idea than a bad idea,

because you forget that the good idea has limits.

Patient-specific cranial implants are important and necessary in the

surgery of cranial defect restoration. However, traditional methods of

manual design of cranial implants are complicated and time-consuming.

Our purpose is to develop a novel software named EasyCrania to design

the cranial implants conveniently and efficiently. The process can be

divided into five steps, which are mirroring model, clipping surface,

surface fitting, the generation of the initial implant and the

generation of the final implant. The main concept of our method is to

use the geometry information of the mirrored model as the base to

generate the final implant. The comparative studies demonstrated that

the EasyCrania can improve the efficiency of cranial implant design

significantly. Moreover, the intra- and inter-rater reliabilities of

the software were stable, at 87.07±1.6% and 87.73±1.4%,

respectively.

Multi-view face detection in open environment is a challenging task

due to diverse variations of face appearances and shapes. Most

multi-view face detectors depend on multiple models and organize them

in parallel, pyramid or tree structure, which compromise between the

accuracy and time-cost. Aiming at a more favorable multi-view face

detector, we propose a novel funnel-structured cascade (FuSt)

detection framework. In a coarse-to-fine flavor, our FuSt consists of,

from top to bottom, 1) multiple view-specific fast LAB cascade for

extremely quick face proposal, 2) multiple coarse MLP cascade for

further candidate window verification, and 3) a unified fine MLP

cascade with shape-indexed features for accurate face detection.

Compared with other structures, on the one hand, the proposed one uses

multiple computationally efficient distributed classifiers to propose

a small number of candidate windows but with a high recall of

multi-view faces. On the other hand, by using a unified MLP cascade to

examine proposals of all views in a centralized style, it provides a

favorable solution for multi-view face detection with high accuracy

and low time-cost. Besides, the FuSt detector is alignment-aware and

performs a coarse facial part prediction which is beneficial for

subsequent face alignment. Extensive experiments on two challenging

datasets, FDDB and AFW, demonstrate the effectiveness of our FuSt

detector in both accuracy and speed.

Symmetric Positive Definite (SPD) matrices have been widely used for

data representation in many visual recognition tasks. The success

is mainly attributed to learning discriminative SPD matrices that

encode the Riemannian geometry of the underlying SPD manifold. In

this paper, we propose a geometry-aware SPD similarity learning

(SPDSL) framework to learn discriminative SPD features by directly

pursuing a manifold-to-manifold transformation matrix of full column rank.

Specifically, by exploiting the Riemannian geometry of the manifold of

fixed-rank Positive Semidefinite (PSD) matrices, we present a new

solution to reduce optimizing over the space of full-column-rank

transformation matrices to optimizing on the PSD manifold which has a

well-established Riemannian structure. Under this solution, we exploit

a new supervised SPD similarity learning technique to learn the

transformation by regressing the similarities of selected SPD data

pairs to their ground-truth similarities on the target SPD manifold.

To optimize the proposed objective function, we further derive an

algorithm on the PSD manifold. Evaluations on three visual

classification tasks show the advantages of the proposed approach over

the existing SPD-based discriminant learning methods.

But at this point, these people instead become timid and overcautious.

Smart people like Charlie Munger realize that the antidote to this sort

of mental overreaching is to add more models to your mental palette — to

expand your repertoire of ideas, making them vivid and available in the

problem-solving process.

• [cs.CV]**Joint Prediction of Depths, Normals and Surface Curvature
from RGB Images using CNNs**

*Thanuja Dharmasiri, Andrew Spek, Tom Drummond*

http://arxiv.org/abs/1706.07593v1

• [cs.CV]**Real-time Human Pose Estimation from Video with
Convolutional Neural Networks**

*Marko Linna, Juho Kannala, Esa Rahtu*

http://arxiv.org/abs/1609.07420v1

• [cs.CV]**Globally Variance-Constrained Sparse Representation for
Image Set Compression**

*Xiang Zhang, Jiarui Sun, Siwei Ma, Zhouchen Lin, Jian Zhang, Shiqi Wang, Wen Gao*

http://arxiv.org/abs/1608.04902v1

Because one small move on their part could send them plummeting to the bottom.

You’ll know you’re on to something when ideas start to compete with one

another — you’ll find situations where Model 1 tells you X and Model 2

tells you Y. Believe it or not, this is a sign that you’re on the right

track. Letting the models compete and fight for superiority and greater

fundamental truth is what good thinking is all about! It’s hard work,

but that’s the only way to get the right answers.

Understanding the 3D structure of a scene is of vital importance, when

it comes to developing fully autonomous robots. To this end, we

present a novel deep learning based framework that estimates depth,

surface normals and surface curvature by only using a single RGB

image. To the best of our knowledge this is the first work to estimate

surface curvature from colour using a machine learning approach.

Additionally, we demonstrate that by tuning the network to infer

well-designed features, such as surface curvature, we can achieve

improved performance at estimating depth and normals. This indicates that

network guidance is still a useful aspect of designing and training a

neural network. We run extensive experiments where the network is

trained to infer different tasks while the model capacity is kept

constant resulting in different feature maps based on the tasks at

hand. We outperform the previous state-of-the-art benchmarks which

jointly estimate depths and surface normals while predicting surface

curvature in parallel.

In this paper, we present a method for real-time multi-person human

pose estimation from video by utilizing convolutional neural networks.

Our method is aimed for use case specific applications, where good

accuracy is essential and variation of the background and poses is

limited. This enables us to use a generic network architecture, which

is both accurate and fast. We divide the problem into two phases: (1)

pre-training and (2) finetuning. In pre-training, the network is

learned with highly diverse input data from publicly available

datasets, while in finetuning we train with application specific data,

which we record with Kinect. Our method differs from most of the

state-of-the-art methods in that we consider the whole system,

including person detector, pose estimator and an automatic way to

record application specific training material for finetuning. Our

method is considerably faster than many of the state-of-the-art

methods. Our method can be thought of as a replacement for Kinect, and

it can be used for higher level tasks, such as gesture control, games,

person tracking, action recognition and action tracking. We achieved

accuracy of 96.8% (PCK@0.2) with application specific data.
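
The PCK@0.2 figure can be unpacked with a small sketch, assuming the usual definition with a per-person reference size (e.g. torso or bounding-box scale); the toy keypoints below are illustrative:

```python
import numpy as np

def pck(pred, gt, ref_size, alpha=0.2):
    """PCK@alpha: fraction of predicted keypoints lying within
    alpha * ref_size of their ground-truth positions."""
    d = np.linalg.norm(pred - gt, axis=-1)  # per-keypoint error
    return float((d <= alpha * ref_size).mean())

gt = np.array([[0.0, 0.0], [10.0, 0.0], [10.0, 10.0]])
pred = gt + np.array([[0.5, 0.0], [0.0, 0.0], [5.0, 0.0]])
score = pck(pred, gt, ref_size=10.0)  # 2 of 3 keypoints within 2.0
```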

Sparse representation presents an efficient approach to approximately

recover a signal by the linear composition of a few bases from a

learnt dictionary, based on which various successful applications have

been observed. However, in the scenario of data compression, its

efficiency and popularity are hindered due to the extra overhead for

encoding the sparse coefficients. Therefore, how to establish an

accurate rate model in sparse coding and dictionary learning becomes

meaningful, which has not been fully exploited in the context of

sparse representation. According to the Shannon entropy inequality,

the variance of a data source bounds its entropy, which can reflect

actual coding bits. Hence, in this work a Globally

Variance-Constrained Sparse Representation (GVCSR) model is proposed,

where a variance-constrained rate model is introduced in the

optimization process. Specifically, we employ the Alternating

Direction Method of Multipliers (ADMM) to solve the non-convex

optimization problem for sparse coding and dictionary learning, both

of which have shown state-of-the-art performance in image

representation. Furthermore, we investigate the potential of GVCSR in

practical image set compression, where a common dictionary is trained

by several key images to represent the whole image set. Experimental

results have demonstrated significant performance improvements over

the most popular image codecs, including JPEG and JPEG 2000.
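
The variance–entropy relation invoked here is the classical Gaussian maximum-entropy bound: for a source $X$ with variance $\sigma^2$, the differential entropy satisfies

```latex
h(X) \le \frac{1}{2}\log\left(2\pi e\,\sigma^{2}\right),
```

with equality exactly when $X$ is Gaussian; this is why a variance constraint can serve as a tractable proxy for the coding rate.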

Even if you drive the Ferrari of everyone's dreams and live in Tomson Riviera,

It’s a little like learning to walk or ride a bike; at first, you can’t

believe how much you’re supposed to do all at once, but eventually, you

wonder how you ever didn’t know how to do it.

• [cs.CV]**Listen to Your Face: Inferring Facial Action Units from
Audio Channel**

*Zibo Meng, Shizhong Han, Yan Tong*

http://arxiv.org/abs/1706.07536v1

• [cs.CV]**The face-space duality hypothesis: a computational
model**

*Jonathan Vitale, Mary-Anne Williams, Benjamin Johnston*

http://arxiv.org/abs/1609.07371v1

• [cs.CV]**Large Angle based Skeleton Extraction for 3D Animation**

*Hugo Martin, Raphael Fernandez, Yong Khoo*

http://arxiv.org/abs/1608.05045v1

it all exists merely in the service of success.

As Charlie Munger likes to say, going back to any other method of

thinking would feel like cutting off your hands. Our experience confirms

the truth of Munger’s dictum.

Extensive efforts have been devoted to recognizing facial action units

(AUs). However, it is still challenging to recognize AUs from

spontaneous facial displays, especially when they are accompanied by

speech. Different from all prior work that utilized visual

observations for facial AU recognition, this paper presents a novel

approach that recognizes speech-related AUs exclusively from audio

signals based on the fact that facial activities are highly correlated

with voice during speech. Specifically, dynamic and physiological

relationships between AUs and phonemes are modeled through a

continuous time Bayesian network (CTBN); then AU recognition is

performed by probabilistic inference via the CTBN model. A pilot

audiovisual AU-coded database has been constructed to evaluate the

proposed audio-based AU recognition framework. The database consists

of a “clean” subset with frontal and neutral faces and a challenging

subset collected with large head movements and occlusions.

Experimental results on this database show that the proposed CTBN

model achieves promising recognition performance for 7 speech-related

AUs and outperforms the state-of-the-art visual-based methods

especially for those AUs that are activated at low intensities or

“hardly visible” in the visual channel. Furthermore, the CTBN model

yields more impressive recognition performance on the challenging

subset, where the visual-based approaches suffer significantly.

Valentine’s face-space suggests that faces are represented in a

psychological multidimensional space according to their perceived

properties. However, the proposed framework was initially designed as

an account of invariant facial features only, and explanations for

dynamic feature representation were neglected. In this paper we

propose, develop and evaluate a computational model for a twofold

structure of the face-space, able to unify both identity and

expression representations in a single implemented model. To capture

both invariant and dynamic facial features we introduce the face-space

duality hypothesis and subsequently validate it through a mathematical

presentation using a general approach to dimensionality reduction. Two

experiments with real facial images show that the proposed face-space:

(1) supports both identity and expression recognition, and (2) has a

twofold structure anticipated by our formal argument.

In this paper, we present a solution for arbitrary 3D character

deformation by investigating rotation angle of decomposition and

preserving the mesh topology structure. In computer graphics, skeleton

extraction and skeleton-driven animation are active areas attracting

increasing interest from researchers. Accuracy is critical for

realistic animation and related applications. There have been

extensive studies on skeleton based 3D deformation. However for the

scenarios of large angle rotation of different body parts, it has been

relatively less addressed by the state-of-the-art, which often yields

unsatisfactory results. Besides 3D animation problems, we also notice

for many 3D skeleton detection or tracking applications from video

or depth streams, large angle rotation is also a critical factor in

the regression accuracy and robustness. We introduce a distortion

metric function to quantify surface curvature before and after

deformation, which is a major clue for large angle rotation detection.

Extensive experimental results show that our method is suitable

for 3D modeling, animation, and skeleton-based tracking applications.

You can enjoy this kind of success, but not the things that genuinely move you.

More About Mental Models

• [cs.CV]**Sampling Matters in Deep Embedding Learning**

*Chao-Yuan Wu, R. Manmatha, Alexander J. Smola, Philipp Krähenbühl*

http://arxiv.org/abs/1706.07567v1

• [cs.CY]**On the (im)possibility of fairness**

*Sorelle A. Friedler, Carlos Scheidegger, Suresh Venkatasubramanian*

http://arxiv.org/abs/1609.07236v1

• [cs.CY]**Modelling Student Behavior using Granular Large Scale
Action Data from a MOOC**

*Steven Tang, Joshua C. Peterson, Zachary A. Pardos*

http://arxiv.org/abs/1608.04789v1


What kinds of knowledge are we talking about adding to our repertoire?

Deep embeddings answer one simple question: How similar are two

images? Learning these embeddings is the bedrock of verification,

zero-shot learning, and visual search. The most prominent approaches

optimize a deep convolutional network with a suitable loss function,

such as contrastive loss or triplet loss. While a rich line of work

focuses solely on the loss functions, we show in this paper that

selecting training examples plays an equally important role. We

propose distance weighted sampling, which selects more informative and

stable examples than traditional approaches. In addition, we show that

a simple margin based loss is sufficient to outperform all other loss

functions. We evaluate our approach on the Stanford Online Products,

CAR196, and the CUB200-2011 datasets for image retrieval and

clustering, and on the LFW dataset for face verification. Our method

achieves state-of-the-art performance on all of them.
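The distance-weighted idea can be sketched as follows, assuming embeddings on a unit hypersphere: weight each candidate negative by the inverse of the analytic pairwise-distance density q(d) ∝ d^(n-2) (1 - d²/4)^((n-3)/2), so sampled negatives spread across all distances instead of clustering near √2. The function name, cutoff, and clipping constants here are illustrative choices, not taken from the paper:

```python
import numpy as np

def distance_weighted_sample(anchor, candidates, dim, cutoff=0.5, rng=None):
    """Pick one negative example for `anchor`, weighting candidates by the
    inverse of the analytic density of pairwise distances on the unit
    hypersphere. Uniform sampling would mostly return "easy" negatives at
    distance ~sqrt(2); inverse-density weighting yields more informative
    yet stable examples."""
    rng = rng or np.random.default_rng()
    d = np.linalg.norm(candidates - anchor, axis=1)
    d = np.maximum(d, cutoff)  # clip tiny distances to avoid noisy samples
    log_q = ((dim - 2) * np.log(d)
             + (dim - 3) / 2 * np.log(np.clip(1 - d ** 2 / 4, 1e-8, None)))
    w = np.exp(-(log_q - log_q.min()))  # inverse density, numerically stable
    w /= w.sum()
    return rng.choice(len(candidates), p=w)

rng = np.random.default_rng(0)
emb = rng.standard_normal((100, 64))
emb /= np.linalg.norm(emb, axis=1, keepdims=True)  # points on the unit sphere
idx = distance_weighted_sample(emb[0], emb[1:], dim=64, rng=rng)
print("sampled negative index:", idx)
```

In a real training loop the sampled index would feed a margin-based loss on (anchor, positive, negative) triples.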

What does it mean for an algorithm to be fair? Different papers use

different notions of algorithmic fairness, and although these appear

internally consistent, they also seem mutually incompatible. We

present a mathematical setting in which the distinctions in previous

papers can be made formal. In addition to characterizing the spaces of

inputs (the “observed” space) and outputs (the “decision” space), we

introduce the notion of a construct space: a space that captures

unobservable, but meaningful variables for the prediction. We show

that in order to prove desirable properties of the entire

decision-making process, different mechanisms for fairness require

different assumptions about the nature of the mapping from construct

space to decision space. The results in this paper imply that future

treatments of algorithmic fairness should more explicitly state

assumptions about the relationship between constructs and

observations.

Digital learning environments generate a precise record of the actions

learners take as they interact with learning materials and complete

exercises towards comprehension. With this high quantity of sequential

data comes the potential to apply time series models to learn about

underlying behavioral patterns and trends that characterize successful

learning based on the granular record of student actions. There exist

several methods for looking at longitudinal, sequential data like

those recorded from learning environments. In the field of language

modelling, traditional n-gram techniques and modern recurrent neural

network (RNN) approaches have been applied to algorithmically find

structure in language and predict the next word given the previous

words in the sentence or paragraph as input. In this paper, we draw an

analogy to this work by treating student sequences of resource views

and interactions in a MOOC as the inputs and predicting students’ next

interaction as outputs. In this study, we train only on students who

received a certificate of completion. In doing so, the model could

potentially be used for recommendation of sequences eventually leading

to success, as opposed to perpetuating unproductive behavior. Given

that the MOOC used in our study had over 3,500 unique resources,

predicting the exact resource that a student will interact with next

might appear to be a difficult classification problem. We find that

simply following the syllabus (built-in structure of the course) gives

on average 23% accuracy in making this prediction, followed by the

n-gram method with 70.4%, and RNN based methods with 72.2%. This

research lays the ground work for recommendation in a MOOC and other

digital learning environments where high volumes of sequential data

exist.
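The n-gram baseline can be illustrated with its simplest case, a bigram counter over resource IDs; the resource names and sequences below are invented toy data:

```python
from collections import Counter, defaultdict

def train_bigram(sequences):
    """Count resource-to-resource transitions from certified students'
    interaction sequences (the n=2 case of the n-gram baseline)."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, current):
    """Predict the most frequent follower of `current`, or None if unseen."""
    if current not in counts:
        return None
    return counts[current].most_common(1)[0][0]

# Toy interaction sequences from three "certified" students
seqs = [["intro", "vid1", "quiz1", "vid2"],
        ["intro", "vid1", "vid2"],
        ["intro", "vid1", "quiz1", "quiz1"]]
model = train_bigram(seqs)
print(predict_next(model, "vid1"))  # "quiz1": follows "vid1" twice vs. once
```

Training only on certificate earners, as in the study, biases such predictions toward transitions that historically led to completion.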

Taking myself as an example, we pessimists clearly believe that

It’s the big, basic ideas of all the truly fundamental academic

disciplines. The stuff you should have learned in the “101” course of

each major subject but probably didn’t. These are the true general

principles that underlie most of what’s going on in the world.

• [cs.CV]**Training Adversarial Discriminators for Cross-channel
Abnormal Event Detection in Crowds**

*Mahdyar Ravanbakhsh, Enver Sangineto, Moin Nabi, Nicu Sebe*

http://arxiv.org/abs/1706.07680v1

• [cs.CY]**Tracking the Trackers: Towards Understanding the Mobile
Advertising and Tracking Ecosystem**

*Narseo Vallina-Rodriguez, Srikanth Sundaresan, Abbas Razaghpanah,*

Rishab Nithyanand, Mark Allman, Christian Kreibich, Phillipa Gill


http://arxiv.org/abs/1609.07190v1

• [cs.DC]**Safe Serializable Secure Scheduling: Transactions and the
Trade-off Between Security and Consistency**

*Isaac Sheff, Tom Magrino, Jed Liu, Andrew C. Myers, Robbert van*

Renesse


http://arxiv.org/abs/1608.04841v1

most of us are failures.

Things like: The main laws of physics. The main ideas driving chemistry.

The big, useful tools of mathematics. The guiding principles of biology.

The hugely useful concepts from human psychology. The central principles

of systems thinking. The working concepts behind business and markets.

Abnormal crowd behaviour detection attracts a large interest due to

its importance in video surveillance scenarios. However, the ambiguity

and the lack of sufficient “abnormal” ground truth data make

end-to-end training of large deep networks hard in this domain. In

this paper we propose to use Generative Adversarial Nets (GANs), which

are trained to generate only the “normal” distribution of the data.

During the adversarial GAN training, a discriminator “D” is used as a

supervisor for the generator network “G” and vice versa. At testing

time we use “D” to solve our discriminative task (abnormality

detection), where “D” has been trained without the need of

manually-annotated abnormal data. Moreover, to prevent “G” from

learning a trivial identity function, we use a cross-channel approach,

forcing “G” to transform raw-pixel data into motion information and vice

versa. The quantitative results on standard benchmarks show that our

method outperforms previous state-of-the-art methods in both the

frame-level and the pixel-level evaluation.

Third-party services form an integral part of the mobile ecosystem:

they allow app developers to add features such as performance

analytics and social network integration, and to monetize their apps

by enabling user tracking and targeted ad delivery. At present users,

researchers, and regulators all have at best limited understanding of

this third-party ecosystem. In this paper we seek to shrink this gap.

Using data from users of our ICSI Haystack app we gain a rich view of

the mobile ecosystem: we identify and characterize domains associated

with mobile advertising and user tracking, thereby taking an important

step towards greater transparency. We furthermore outline our steps

towards a public catalog and census of analytics services, their

behavior, their personal data collection processes, and their use

across mobile apps.

Modern applications often operate on data in multiple administrative

domains. In this federated setting, participants may not fully trust

each other. These distributed applications use transactions as a core

mechanism for ensuring reliability and consistency with persistent

data. However, the coordination mechanisms needed for transactions can

both leak confidential information and allow unauthorized influence.

By implementing a simple attack, we show these side channels can be

exploited. However, our focus is on preventing such attacks. We

explore secure scheduling of atomic, serializable transactions in a

federated setting. While we prove that no protocol can guarantee

security and liveness in all settings, we establish conditions for

sets of transactions that can safely complete under secure scheduling.

Based on these conditions, we introduce staged commit, a secure

scheduling protocol for federated transactions. This protocol avoids

insecure information channels by dividing transactions into distinct

stages. We implement a compiler that statically checks code to ensure

it meets our conditions, and a system that schedules these

transactions using the staged commit protocol. Experiments on this

implementation demonstrate that realistic federated transactions can

be scheduled securely, atomically, and efficiently.

It is because we exist that the successful have someone to be compared against.

These are the winning ideas. For all of the “bestselling” crap that is

touted as the new thing each year, there is almost certainly a bigger,

more fundamental, and more broadly applicable underlying idea that we

already knew about! The “new idea” is thus an application of old ideas,

packaged into a new format.

• [cs.CY]**Computational Controversy**

*Benjamin Timmermans, Tobias Kuhn, Kaspar Beelen, Lora Aroyo*

http://arxiv.org/abs/1706.07643v1

• [cs.DC]**MPI Parallelization of the Resistive Wall Code STARWALL:
Report of the EUROfusion High Level Support Team Project JORSTAR**

*S. Mochalskyy, M. Hoelzl, R. Hatzky*

http://arxiv.org/abs/1609.07441v1

• [cs.DC]**The BioDynaMo Project: Creating a Platform for Large-Scale
Reproducible Biological Simulations**

*Lukas Breitwieser, Roman Bauer, Alberto Di Meglio, Leonard Johard,*

Marcus Kaiser, Marco Manca, Manuel Mazzara, Fons Rademakers, Max

Talanov


http://arxiv.org/abs/1608.04967v1

And we look up at them, striving to become them, while forgetting ourselves.

Yet we tend to spend the majority of time keeping up with the “new” at

the expense of learning the “old”! This is truly nuts.

Climate change, vaccination, abortion, Trump: Many topics are

surrounded by fierce controversies. The nature of such heated debates

and their elements have been studied extensively in the social science

literature. More recently, various computational approaches to

controversy analysis have appeared, using new data sources such as

Wikipedia, which now help us better understand these phenomena.

However, compared to what social sciences have discovered about such

debates, the existing computational approaches mostly focus on just a

few of the many important aspects around the concept of controversies.

In order to link the two strands, we provide and evaluate here a

controversy model that is both rooted in the findings of the social

science literature and at the same time strongly linked to

computational methods. We show how this model can lead to

computational controversy analytics that have full coverage over all

the crucial aspects that make up a controversy.

Large scale plasma instabilities inside a tokamak can be influenced by

the currents flowing in the conducting vessel wall. This involves

nonlinear plasma dynamics and its interaction with the wall current. In

order to study this problem the code that solves the

magneto-hydrodynamic (MHD) equations, called JOREK, was coupled with

the model for the vacuum region and the resistive conducting structure

named STARWALL. The JOREK-STARWALL model has already been applied to

perform simulations of the Vertical Displacement Events (VDEs), the

Resistive Wall Modes (RWMs), and Quiescent H-Mode. At the beginning of

the project it was not possible to resolve the realistic wall

structure with a large number of finite element triangles due to the

huge consumption of memory and wall clock time by STARWALL and the

corresponding coupling routine in JOREK. Moreover, both the STARWALL

code and the JOREK coupling routine are only partially parallelized

via OpenMP. The aim of this project is to implement an MPI

parallelization in the model that should make it possible to obtain realistic

results with high resolution. This project concentrates on the MPI

parallelization of STARWALL. Parallel I/O and the MPI parallelization

of the coupling terms inside JOREK will be addressed in a follow-up

project.

Computer simulations have become a very powerful tool for scientific

research. In order to facilitate research in computational biology,

the BioDynaMo project aims at a general platform for biological

computer simulations, which should be executable on hybrid cloud

computing systems. This paper describes challenges and lessons learnt

during the early stages of the software development process, in the

context of implementation issues and the international nature of the

collaboration.

They say men have two hobbies: those with money play with cars; those without play with cameras.

The mental-models approach inverts the process to the way it should be:

learning the Big Stuff deeply and then using that powerful database

every single day.

• [cs.CY]**Human decisions in moral dilemmas are largely described by
Utilitarianism: virtual car driving study provides guidelines for
ADVs**

*Maximilian Alexander Wächter, Anja Faulhaber, Felix Blind, Silja Timm,*

Anke Dittmer, Leon René Sütfeld, Achim Stephan, Gordon Pipa, Peter

König


http://arxiv.org/abs/1706.07332v2

• [cs.DL]**OCR++: A Robust Framework For Information Extraction from
Scholarly Articles**

*Mayank Singh, Barnopriyo Barua, Priyank Palod, Manvi Garg, Sidhartha*

Satapathy, Samuel Bushi, Kumar Ayush, Krishna Sai Rohith, Tulasi Gamidi,

Pawan Goyal, Animesh Mukherjee


http://arxiv.org/abs/1609.06423v3

• [cs.DL]**Anomalies in the peer-review system: A case study of the
journal of High Energy Physics**

*Sandipan Sikdar, Matteo Marsili, Niloy Ganguly, Animesh Mukherjee*

http://arxiv.org/abs/1608.04875v1

Of course, I fell into the latter camp.

The overarching goal is to build a powerful “tree” of the mind with

strong and deep roots, a massive trunk, and lots of sturdy branches. We

use this tree to hang the “leaves” of experience we acquire, directly

and vicariously, throughout our lifetimes: the scenarios, decisions,

problems, and solutions arising in any human life.

Ethical thought experiments such as the trolley dilemma have been

investigated extensively in the past, showing that humans act in a

utilitarian way, trying to cause as little overall damage as possible.

These trolley dilemmas have gained renewed attention over the past

years, especially due to the necessity of implementing moral decisions

in autonomous driving vehicles. We conducted a set of experiments in

which participants experienced modified trolley dilemmas as the driver

in a virtual reality environment. Participants had to make

decisions between two discrete options: driving on one of two lanes

where different obstacles came into view. Obstacles included a variety

of human-like avatars of different ages and group sizes. Furthermore,

we tested the influence of a sidewalk as a potential safe harbor and a

condition implicating a self-sacrifice. Results showed that subjects,

in general, decided in a utilitarian manner, sparing the highest

number of avatars possible with a limited influence of the other

variables. Our findings indicate that human behavior is in line with

the utilitarian approach to moral decision making. This may serve as a

guideline for the implementation of moral decisions in ADVs.

This paper proposes OCR++, an open-source framework designed for a

variety of information extraction tasks from scholarly articles

including metadata (title, author names, affiliation and e-mail),

structure (section headings and body text, table and figure headings,

URLs and footnotes) and bibliography (citation instances and

references). We analyze a diverse set of scientific articles written

in English language to understand generic writing patterns and

formulate rules to develop this hybrid framework. Extensive

evaluations show that the proposed framework outperforms the existing

state-of-the-art tools by a large margin in structural information

extraction along with improved performance in metadata and

bibliography extraction tasks, both in terms of accuracy (around 50%

improvement) and processing time (around 52% improvement). A user

experience study conducted with the help of 30 researchers reveals

that they found the system very helpful. As an

additional objective, we discuss two novel use cases including

automatically extracting links to public datasets from the

proceedings, which would further accelerate the advancement in digital

libraries. The result of the framework can be exported as a whole into

structured TEI-encoded documents. Our framework is accessible online

at

http://cnergres.iitkgp.ac.in/OCR++/home/.

The peer-review system has long been relied upon for bringing quality

research to the notice of the scientific community and also preventing

flawed research from entering into the literature. The need for the

peer-review system has often been debated as in numerous cases it has

failed in its task, and in most of these cases the editors and

reviewers were thought to be responsible for not being able to

correctly judge the quality of the work. This raises the question: “Can

the peer-review system be improved?” Since editors and reviewers are

the most important pillars of a reviewing system, we in this work,

attempt to address a related question – given the editing/reviewing

history of the editors or reviewers, “can we identify the

under-performing ones?”, with citations received by the

edited/reviewed papers being used as proxy for quantifying

performance. We term such reviewers and editors anomalous, and we

believe that identifying and removing them will improve the performance of

the peer-review system. Using a massive dataset of the Journal of High

Energy Physics (JHEP) consisting of 29k papers submitted between 1997

and 2015 with 95 editors and 4035 reviewers and their review history,

we identify several factors which point to anomalous behavior of

referees and editors. In fact, the anomalous editors and reviewers

account for 26.8% and 14.5% of the total editors and reviewers

respectively and for most of these anomalous reviewers the performance

degrades alarmingly over time.

But among those who play with cameras, how many still do it simply to take photographs?

Now, let’s start by summarizing the models we’ve found useful. To

explore them in more depth, click the links provided below.

• [cs.CY]**Mediated behavioural change in human-machine networks:
exploring network characteristics, trust and motivation**

*Paul Walland, J. Brian Pickering*

http://arxiv.org/abs/1706.07597v1

• [cs.DS]**Scheduling Under Power and Energy Constraints**

*Mohammed Haroon Dupty, Pragati Agrawal, Shrisha Rao*

http://arxiv.org/abs/1609.07354v1

• [cs.DS]**Faster Sublinear Algorithms using Conditional Sampling**

*Themistoklis Gouleakis, Christos Tzamos, Manolis Zampetakis*

http://arxiv.org/abs/1608.04759v1

Going out with a Sony DSLR, I don't dare greet anyone, because the scorn from the C and N camps alone could drown me.

And remember: Building your latticework is a lifelong project. Stick

with it, and you’ll find that your ability to understand reality, make

consistently good decisions, and help those you love will always be

improving.

Human-machine networks pervade much of contemporary life. Network

change is the product of structural modifications along with

differences in participant behaviour. If we assume that behavioural

change in a human-machine network is the result of changing the

attitudes of participants in the network, then the question arises

whether network structure can affect participant attitude. Taking

citizen participation as an example, engagement with relevant

stakeholders reveals trust and motivation to be the major objectives

for the network. Using a typology to describe network state based on

multiple characteristics or dimensions, we can predict possible

behavioural outcomes in the network. However, this has to be mediated

via attitude change. Motivation for the citizen participation network

can only increase in line with enhanced trust. The focus for changing

network dynamics, therefore, shifts to the dimensional changes needed

to encourage increased trust. It turns out that the coordinated

manipulation of multiple dimensions is needed to bring about the

desired shift in attitude.

Given a system model where machines have distinct speeds and power

ratings but are otherwise compatible, we consider various problems of

scheduling under resource constraints on the system which place the

restriction that not all machines can be run at once. These can be

power, energy, or makespan constraints on the system. Given such

constraints, there are problems with divisible as well as

non-divisible jobs. In the setting where there is a constraint on

power, we show that the problem of minimizing makespan for a set of

divisible jobs is NP-hard by a reduction from the knapsack problem. We

then show that scheduling to minimize energy with power constraints is

also NP-hard. We then consider scheduling with energy and makespan

constraints with divisible jobs and show that these can be solved in

polynomial time, and the problems with non-divisible jobs are NP-hard.

We give exact and approximation algorithms for these problems as

required.
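The knapsack connection can be made concrete. Under a power cap, choosing which machines to run so as to maximize total processing speed is exactly 0/1 knapsack, with power rating as weight and speed as value. The standard pseudo-polynomial DP below is an illustration of that subproblem, not the paper's reduction; the machine data are made up:

```python
def max_speed_under_power(machines, power_cap):
    """0/1 knapsack over machines: each is a (speed, power) pair. Return
    the maximum total speed achievable without exceeding the power cap.
    Pseudo-polynomial DP, O(len(machines) * power_cap) time."""
    best = [0] * (power_cap + 1)
    for speed, power in machines:
        # Iterate capacities downward so each machine is used at most once
        for cap in range(power_cap, power - 1, -1):
            best[cap] = max(best[cap], best[cap - power] + speed)
    return best[power_cap]

machines = [(10, 4), (7, 3), (5, 2)]  # (speed, power rating)
print(max_speed_under_power(machines, power_cap=5))  # → 12 (7+5, power 3+2)
```

The pseudo-polynomial runtime is consistent with NP-hardness: it is polynomial in the numeric value of the cap, not in its bit length.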

A conditional sampling oracle for a probability distribution D returns

samples from the conditional distribution of D restricted to a

specified subset of the domain. A recent line of work (Chakraborty et

al. 2013 and Canonne et al. 2014) has shown that having access to such

a conditional sampling oracle requires only a polylogarithmic or even

constant number of samples to solve distribution testing problems like

identity and uniformity. This significantly improves over the standard

sampling model where polynomially many samples are necessary. Inspired

by these results, we introduce a computational model based on

conditional sampling to develop sublinear algorithms with

exponentially faster runtimes compared to standard sublinear

algorithms. We focus on geometric optimization problems over points in

high dimensional Euclidean space. Access to these points is provided

via a conditional sampling oracle that takes as input a succinct

representation of a subset of the domain and outputs a uniformly

random point in that subset. We study two well-studied problems:

k-means clustering and estimating the weight of the minimum spanning

tree. In contrast to prior algorithms for the classic model, our

algorithms have time, space and sample complexity that is polynomial

in the dimension and polylogarithmic in the number of points. Finally,

we comment on the applicability of the model and compare with existing

ones like streaming, parallel and distributed computational models.
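A naive reference implementation makes the oracle concrete; the class name and box-shaped subsets are illustrative, and the sublinear algorithms in the paper assume such an oracle is provided cheaply rather than via the linear scan used here:

```python
import random

class ConditionalSampler:
    """Conditional sampling oracle over a stored point set: given a subset
    of the domain (here an axis-aligned box as the succinct representation),
    return a uniformly random stored point inside it, or None if empty."""

    def __init__(self, points):
        self.points = points

    def sample(self, lo, hi, rng=random):
        inside = [p for p in self.points
                  if all(l <= x <= h for x, l, h in zip(p, lo, hi))]
        return rng.choice(inside) if inside else None

pts = [(0.1, 0.2), (0.5, 0.9), (0.8, 0.3)]
oracle = ConditionalSampler(pts)
print(oracle.sample(lo=(0.0, 0.0), hi=(0.6, 1.0)))  # one of the first two points
```

An algorithm built on this interface pays one oracle call per sample, which is the resource the polylogarithmic bounds count.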

When people ask me for camera advice, I never recommend Sony either; it's enough that I enjoy shooting with it myself.

The Farnam Street Latticework of Mental Models

• [cs.DC]**Heterogeneous MPSoCs for Mixed Criticality Systems:
Challenges and Opportunities**

*Mohamed Hassan*

http://arxiv.org/abs/1706.07429v1

• [cs.LG]**A Novel Progressive Multi-label Classifier for
Classincremental Data**

*Mihika Dave, Sahil Tapiawala, Meng Joo Er, Rajasekar Venkatesan*

http://arxiv.org/abs/1609.07215v1

• [cs.DS]**Lecture Notes on Spectral Graph Methods**

*Michael W. Mahoney*

http://arxiv.org/abs/1608.04845v1

I see other twenty-year-old men with money, a car, a dog, and a 5DII, lacking nothing in life.

Mental Models — How to Solve Problems

Due to their cost, performance, area, and energy efficiency, MPSoCs

offer an appealing architecture for emerging mixed criticality systems

(MCS) such as driverless cars, smart power grids, and healthcare

devices. Furthermore, heterogeneity of MPSoCs presents exceptional

opportunities to satisfy the conflicting requirements of MCS. Seizing

these opportunities is unattainable without addressing the associated

challenges. We focus on four aspects of MCS that we believe are of

most importance upon adopting MPSoCs: theoretical model, interference,

data sharing, and security. We outline existing solutions and highlight

the necessary considerations for MPSoCs, including both the opportunities

they create and the research directions yet to be explored.

In this paper, a progressive learning algorithm for multi-label

classification to learn new labels while retaining the knowledge of

previous labels is designed. New output neurons corresponding to new

labels are added and the neural network connections and parameters are

automatically restructured as if the label has been introduced from

the beginning. This is the first work of its kind on multi-label

classification for class-incremental learning. It is useful for real-world

applications such as robotics where streaming data are available and

the number of labels is often unknown. Based on the Extreme Learning

Machine framework, a novel universal classifier with plug-and-play

capabilities for progressive multi-label classification is developed.

Experimental results on various benchmark synthetic and real datasets

validate the efficiency and effectiveness of our proposed algorithm.
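One way to picture the mechanics, as a sketch of the general ELM recipe rather than the paper's exact restructuring step: because ELM output weights are a linear fit per label on a fixed random hidden layer, a newly arriving label only requires solving for one new output column, with all sizes and data below invented:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 10))               # input features
Y = (rng.random((100, 3)) > 0.5).astype(float)   # 3 initial binary labels

# Random-feature hidden layer (the ELM part): fixed weights, tanh nonlinearity
W_in = rng.standard_normal((10, 50))
H = np.tanh(X @ W_in)
H_pinv = np.linalg.pinv(H)
B = H_pinv @ Y                # output weights, one column per existing label

# A new label arrives: append one output "neuron" by solving only for the
# new column, leaving the existing output weights untouched
y_new = (rng.random((100, 1)) > 0.5).astype(float)
B = np.hstack([B, H_pinv @ y_new])

print(B.shape)  # (50, 4): one output column per label
```

Because the hidden layer is fixed, the new column is exactly what batch training with all four labels from the start would have produced, which is the "as if the label had been introduced from the beginning" property.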

These are lecture notes based on a class I taught on Spectral Graph

Methods at UC Berkeley during the Spring 2015 semester.

I swallow hard and carry on.

General Thinking Concepts (11)

• [cs.DC]**Interoperable Convergence of Storage, Networking and
Computation**

*Micah Beck, Terry Moore, Piotr Luszczek*

http://arxiv.org/abs/1706.07519v1

• [cs.LG]**Multilayer Spectral Graph Clustering via Convex Layer
Aggregation**

*Pin-Yu Chen, Alfred O. Hero III*

http://arxiv.org/abs/1609.07200v1

• [cs.IT]**Hard Clusters Maximize Mutual Information**

*Bernhard C. Geiger, Rana Ali Amjad*

http://arxiv.org/abs/1608.04872v1

I don't think measuring myself against them would change anything.

- Inversion

In every form of digital store-and-forward communication, intermediate

forwarding nodes are computers, with attendant memory and processing

resources. This has inevitably given rise to efforts to create a wide

area infrastructure that goes beyond simple store and forward, a

facility that makes more general and varied use of the potential of

this collection of increasingly powerful nodes. Historically, efforts

in this direction predate the advent of globally routed packet

networking. The desire for a converged infrastructure of this kind has

only intensified over the last 30 years, as memory, storage and

processing resources have both increased in density and speed and

decreased in cost. Although there seems to be a general consensus that

it should be possible to define and deploy such a dramatically more

capable wide area facility, a great deal of investment in research

prototypes has yet to produce a credible candidate architecture.

Drawing on technical analysis, historical examples, and case studies,

we present an argument for the hypothesis that in order to realize a

distributed system with the kind of convergent generality and

deployment scalability that might qualify as “future-defining,” we

must build it up from a small set of simple, generic, and limited

abstractions of the low level processing, storage and network

resources of its intermediate nodes.

Multilayer graphs are commonly used for representing different

relations between entities and handling heterogeneous data processing

tasks. New challenges arise in multilayer graph clustering for

assigning clusters to a common multilayer node set and for combining

information from each layer. This paper presents a theoretical

framework for multilayer spectral graph clustering of the nodes via

convex layer aggregation. Under a novel multilayer signal plus noise

model, we provide a phase transition analysis that establishes the

existence of a critical value on the noise level that permits reliable

cluster separation. The analysis also specifies analytical upper and

lower bounds on the critical value, where the bounds become exact when

the clusters have identical sizes. Numerical experiments on synthetic

multilayer graphs are conducted to validate the phase transition

analysis and study the effect of layer weights and noise levels on

clustering reliability.
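
The aggregation step described above can be pictured in a few lines: a convex combination of per-layer adjacency matrices yields a single graph, whose Fiedler vector then separates the clusters. The toy two-clique layers, the layer weights, and the two-cluster sign-cut below are illustrative assumptions of this sketch, not the paper's experimental setup.

```python
import numpy as np

def aggregate_layers(adjacencies, weights):
    """Convex layer aggregation: sum_l w_l * A_l with w_l >= 0, sum_l w_l = 1."""
    w = np.asarray(weights, dtype=float)
    assert np.all(w >= 0) and abs(w.sum() - 1.0) < 1e-12
    return sum(wl * Al for wl, Al in zip(w, adjacencies))

def two_way_spectral_cut(A):
    """Split nodes by the sign of the Fiedler vector of the graph Laplacian."""
    L = np.diag(A.sum(axis=1)) - A
    _, vecs = np.linalg.eigh(L)   # eigenvalues in ascending order
    fiedler = vecs[:, 1]          # second-smallest eigenvector
    return fiedler >= 0

def clique_pair(bridge):
    """Two 4-node cliques joined by one weak 'noise' edge at `bridge`."""
    A = np.zeros((8, 8))
    for block in (range(0, 4), range(4, 8)):
        for i in block:
            for j in block:
                if i != j:
                    A[i, j] = 1.0
    i, j = bridge
    A[i, j] = A[j, i] = 0.1
    return A

# Two layers sharing the same cluster structure but different noise edges.
A = aggregate_layers([clique_pair((3, 4)), clique_pair((0, 7))], [0.6, 0.4])
labels = two_way_spectral_cut(A)  # True/False per node: recovered clusters
```

With the noise edges this weak, the cut lands well below the critical noise level, so the two cliques are recovered exactly.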

In this paper, we investigate mutual information as a cost function

for clustering, and show in which cases hard, i.e., deterministic,

clusters are optimal. Using convexity properties of mutual

information, we show that certain formulations of the information

bottleneck problem are solved by hard clusters. Similarly, hard

clusters are optimal for the information-theoretic co-clustering

problem that deals with simultaneous clustering of two dependent data

sets. If both data sets have to be clustered using the same cluster

assignment, hard clusters are not optimal in general. We point at

interesting and practically relevant special cases of this so-called

pairwise clustering problem, for which we can either prove or have

evidence that hard clusters are optimal. Our results thus show that

one can relax the otherwise combinatorial hard clustering problem to a

real-valued optimization problem with the same global optimum.
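
A small numerical illustration of the hard-versus-soft distinction (my own toy example, not taken from the paper): when the cluster assignment C is a deterministic function of X, the mutual information I(X; C) collapses to the cluster entropy H(C), while a soft assignment that hedges between clusters conveys strictly less information.

```python
import numpy as np

def entropy(p):
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def mutual_information(joint):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for a joint probability table."""
    joint = joint / joint.sum()
    return entropy(joint.sum(1)) + entropy(joint.sum(0)) - entropy(joint.ravel())

# X takes 4 values uniformly; the hard clustering maps {0,1} -> cluster 0
# and {2,3} -> cluster 1 (deterministic assignment).
hard = np.zeros((4, 2))
hard[0, 0] = hard[1, 0] = hard[2, 1] = hard[3, 1] = 0.25

# A soft clustering over the same X that hedges between the two clusters.
soft = np.array([[0.20, 0.05],
                 [0.20, 0.05],
                 [0.05, 0.20],
                 [0.05, 0.20]])

i_hard = mutual_information(hard)   # equals H(C) = 1 bit for this example
i_soft = mutual_information(soft)   # strictly less than i_hard
```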

People say we must rely on effort later in life, but that effort should be aimed at living more like a human being, not less like one.

Otherwise known as thinking through a situation in reverse or thinking

“backwards,” inversion is a problem-solving technique. Often by

considering what we want to avoid rather than what we want to get, we

come up with better solutions. Inversion works not just in mathematics

but in nearly every area of life. As the saying goes, “Just tell me

where I’m going to die so I can never go there.”

• [cs.DC]**Optimizing the Performance of Reactive Molecular Dynamics
Simulations for Multi-Core Architectures**

*Hasan Metin Aktulga, Christopher Knight, Paul Coffman, Kurt A. O’Hearn, Tzu-Ray Shan, Wei Jiang*

http://arxiv.org/abs/1706.07772v1

• [cs.LG]**Using Neural Network Formalism to Solve Multiple-Instance
Problems**

*Tomas Pevny, Petr Somol*

http://arxiv.org/abs/1609.07257v1

• [cs.LG]**Application of multiview techniques to NHANES dataset**

*Aileme Omogbai*

http://arxiv.org/abs/1608.04783v1

The end of the case-by-case discussion.

- Falsification / Confirmation Bias

Reactive molecular dynamics simulations are computationally demanding.

Reaching spatial and temporal scales where interesting scientific

phenomena can be observed requires efficient and scalable

implementations on modern hardware. In this paper, we focus on

optimizing the performance of the widely used LAMMPS/ReaxC package for

multi-core architectures. As hybrid parallelism allows better leverage

of the increasing on-node parallelism, we adopt thread parallelism in

the construction of bonded and nonbonded lists, and in the computation

of complex ReaxFF interactions. To mitigate the I/O overheads due to

large volumes of trajectory data produced and to save users the burden

of post-processing, we also develop a novel in-situ tool for molecular

species analysis. We analyze the performance of the resulting

ReaxC-OMP package on Mira, an IBM Blue Gene/Q supercomputer. For PETN

systems of sizes ranging from 32 thousand to 16.6 million particles,

we observe speedups in the range of 1.5-4.5x. We observe sustained

performance improvements for up to 262,144 cores (1,048,576 processes)

of Mira and a weak scaling efficiency of 91.5% in large simulations

containing 16.6 million particles. The in-situ molecular species

analysis tool incurs only insignificant overheads across various

system sizes and run configurations.

Many objects in the real world are difficult to describe by a single

numerical vector of a fixed length, whereas describing them by a set

of vectors is more natural. Therefore, multiple instance learning

(MIL) techniques have been steadily gaining in importance in recent

years. MIL formalism represents each object (sample) by a set

(bag) of feature vectors (instances) of fixed length where knowledge

about objects (e.g., class label) is available on bag level but not

necessarily on instance level. Many standard tools including

supervised classifiers have already been adapted to the MIL setting since

the problem was formalized in the late nineties. In this work we propose a

neural network (NN) based formalism that intuitively bridges the gap

between MIL problem definition and the vast existing knowledge-base of

standard models and classifiers. We show that the proposed NN

formalism is effectively optimizable by a modified back-propagation

algorithm and can reveal unknown patterns inside bags. Comparison to

eight types of classifiers from the prior art on a set of 14 publicly

available benchmark datasets confirms the advantages and accuracy of

the proposed solution.
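
The bag-level idea can be sketched in a few lines: score every instance with a small network, then pool with a permutation-invariant max, so the bag score depends on its most positive instance (the classic MIL assumption). The weights below are hand-picked for illustration; they are not the paper's trained model.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def bag_score(bag, W, b, v):
    """Score a bag (a set of instance vectors): embed each instance with a
    one-layer network, then max-pool the per-instance scores across the bag."""
    H = relu(bag @ W + b)        # (n_instances, hidden)
    scores = H @ v               # one score per instance
    return float(scores.max())   # permutation-invariant pooling

# Hand-picked toy weights: the first feature marks "positive" instances.
W = np.array([[1.0, 0.0],
              [0.0, 1.0]])
b = np.zeros(2)
v = np.array([1.0, -0.2])

positive_bag = np.array([[0.1, 0.3], [2.0, 0.1]])  # contains one positive instance
negative_bag = np.array([[0.1, 0.3], [0.2, 0.4]])  # only background instances
```

Because max-pooling is differentiable almost everywhere, a modified back-propagation can push gradients through the pooled instance, which is how such a network can "reveal unknown patterns inside bags."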

Disease prediction or classification using health datasets involve

using well-known predictors associated with the disease as features

for the models. This study considers multiple data components of an

individual’s health, using the relationship between variables to

generate features that may improve the performance of disease

classification models. In order to capture information from different

aspects of the data, this project uses a multiview learning approach,

using Canonical Correlation Analysis (CCA), a technique that finds

projections with maximum correlations between two data views. Data

categories collected from the NHANES survey (1999-2014) are used as

views to learn the multiview representations. The usefulness of the

representations is demonstrated by applying them as features in a

Diabetes classification task.
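
The CCA step above can be sketched directly from covariance matrices: whiten each view and take the SVD of the cross-covariance; the singular values are the canonical correlations. The synthetic two-view data below stands in for the NHANES data categories, which this sketch does not attempt to reproduce.

```python
import numpy as np

def cca_correlations(X, Y, eps=1e-10):
    """Canonical correlations between two data views (rows = samples)."""
    X = X - X.mean(0)
    Y = Y - Y.mean(0)
    n = X.shape[0]
    Sxx = X.T @ X / n + eps * np.eye(X.shape[1])  # view-1 covariance (regularized)
    Syy = Y.T @ Y / n + eps * np.eye(Y.shape[1])  # view-2 covariance (regularized)
    Sxy = X.T @ Y / n                             # cross-covariance

    def inv_sqrt(S):
        vals, vecs = np.linalg.eigh(S)
        return vecs @ np.diag(vals ** -0.5) @ vecs.T

    # Singular values of the whitened cross-covariance = canonical correlations.
    K = inv_sqrt(Sxx) @ Sxy @ inv_sqrt(Syy)
    return np.linalg.svd(K, compute_uv=False)

rng = np.random.default_rng(0)
view_x = rng.normal(size=(200, 3))
# Second view is a near-linear function of the first: strong shared signal.
view_y = view_x @ rng.normal(size=(3, 2)) + 0.01 * rng.normal(size=(200, 2))
corrs = cca_correlations(view_x, view_y)  # top correlation close to 1
```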

Back to the movie.

What a man wishes, he also believes. Similarly, what we believe is what

we choose to see. This is commonly referred to as the confirmation bias.

It is a deeply ingrained mental habit, both energy-conserving and

comfortable, to look for confirmations of long-held wisdom rather than

violations. Yet the scientific process – including hypothesis

generation, blind testing when needed, and objective statistical rigor –

is designed to root out precisely the opposite, which is why it works so

well when followed.

• [cs.DS]**Testing Piecewise Functions**

*Steve Hanneke, Liu Yang*

http://arxiv.org/abs/1706.07669v1

• [cs.MM]**Deep Quality: A Deep No-reference Quality Assessment
System**

*Prajna Paramita Dash, Akshaya Mishra, Alexander Wong*

http://arxiv.org/abs/1609.07170v1

• [cs.LG]**Dynamic Collaborative Filtering with Compound Poisson
Factorization**

*Ghassen Jerfel, Mehmet E. Basbug, Barbara E. Engelhardt*

http://arxiv.org/abs/1608.04839v1

In his final year, Farhan gave up his degree at the Imperial College of Engineering to become a wildlife photographer.

The modern scientific enterprise operates under the principle of

falsification: A method is termed scientific if it can be stated in such

a way that a certain defined result would cause it to be proved false.

Pseudo-knowledge and pseudo-science operate and propagate by being

unfalsifiable – as with astrology, we are unable to prove them either

correct or incorrect because the conditions under which they would be

shown false are never stated.

This work explores the query complexity of property testing for

general piecewise functions on the real line, in the active and

passive property testing settings. The results are proven under an

abstract zero-measure crossings condition, which has as special cases

piecewise constant functions and piecewise polynomial functions. We

find that, in the active testing setting, the query complexity of

testing general piecewise functions is independent of the number of

pieces. We also identify the optimal dependence on the number of

pieces in the query complexity of passive testing in the special case

of piecewise constant functions.

Image quality assessment (IQA) continues to garner great interest in

the research community, particularly given the tremendous rise in

consumer video capture and streaming. Despite significant research

effort in IQA in the past few decades, the area of no-reference image

quality assessment remains a great challenge and is largely unsolved.

In this paper, we propose a novel no-reference image quality

assessment system called Deep Quality, which leverages the power of

deep learning to model the complex relationship between visual content

and the perceived quality. Deep Quality consists of a novel

multi-scale deep convolutional neural network, trained to learn to

assess image quality based on training samples consisting of different

distortions and degradations such as blur, Gaussian noise, and

compression artifacts. Preliminary results using the CSIQ benchmark

image quality dataset showed that Deep Quality was able to achieve

strong quality prediction performance (89% patch-level and 98%

image-level prediction accuracy), being able to achieve similar

performance as full-reference IQA methods.

Model-based collaborative filtering analyzes user-item interactions to

infer latent factors that represent user preferences and item

characteristics in order to predict future interactions. Most

collaborative filtering algorithms assume that these latent factors

are static, although it has been shown that user preferences and item

perceptions drift over time. In this paper, we propose a conjugate and

numerically stable dynamic matrix factorization (DCPF) based on

compound Poisson matrix factorization that models the smoothly

drifting latent factors using Gamma-Markov chains. We propose a

numerically stable Gamma chain construction, and then present a

stochastic variational inference approach to estimate the parameters

of our model. We apply our model to time-stamped ratings data sets:

Netflix, Yelp, and Last.fm, where DCPF achieves a higher predictive

accuracy than state-of-the-art static and dynamic factorization

models.
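
The "smoothly drifting latent factors" can be pictured with a tiny Gamma-Markov chain simulation. The parameterization below (each state Gamma-distributed with mean equal to the previous state) is my own illustrative choice, not necessarily the exact construction in the paper: it keeps the chain positive, as Gamma factors must be, while letting it drift slowly.

```python
import numpy as np

def gamma_markov_chain(theta0, shape, steps, rng):
    """Sample theta_t ~ Gamma(shape, scale=theta_{t-1}/shape), so that
    E[theta_t | theta_{t-1}] = theta_{t-1}: a positive, mean-preserving drift.
    Larger `shape` means lower per-step variance, i.e. smoother drift."""
    theta = [float(theta0)]
    for _ in range(steps):
        theta.append(rng.gamma(shape, theta[-1] / shape))
    return np.array(theta)

rng = np.random.default_rng(42)
chain = gamma_markov_chain(theta0=1.0, shape=50.0, steps=100, rng=rng)
```

In a full DCPF-style model, one such chain would sit behind each latent factor, with the chain hyperparameters estimated by stochastic variational inference.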

“I’ll earn a bit less, my house will be a bit smaller, my car a bit smaller, but I’ll be happy, truly happy.”

- Circle of Competence

• [cs.IR]**Causal Embeddings for Recommendation**

*Stephen Bonner, Flavian Vasile*

http://arxiv.org/abs/1706.07639v1

• [cs.NE]**Deep Learning in Multi-Layer Architectures of Dense
Nuclei**

*Yonghua Yin, Erol Gelenbe*

http://arxiv.org/abs/1609.07160v1

• [cs.LG]**Mollifying Networks**

*Caglar Gulcehre, Marcin Moczulski, Francesco Visin, Yoshua Bengio*

http://arxiv.org/abs/1608.04980v1

Perhaps in real life, failures like us don't have that kind of courage.

An idea introduced by Warren Buffett and Charles Munger in relation to

investing: each individual tends to have an area or areas in which they

really, truly know their stuff, their area of special competence. Areas

not inside that circle are problematic because not only are we ignorant

about them, but we may also be ignorant of our own ignorance. Thus, when

we’re making decisions, it becomes important to define and attend to our

special circle, so as to act accordingly.

Recommendations are treatments. While today's recommender systems

attempt to emulate the naturally occurring user behaviour by

predicting either missing entries in the user-item matrix or computing

the most likely continuation of user sessions, we need to start

thinking of recommendations in terms of optimal interventions with

respect to specific goals, such as increasing the number of user

conversions on an e-commerce website. This objective is known as

Incremental Treatment Effect prediction (ITE) in the causal community.

We propose a new way of factorizing user-item matrices created from a

large sample of biased data collected using a control recommendation

policy and from limited randomized recommendation data collected using

a treatment recommendation policy in order to jointly optimize the

prediction of outcomes of the treatment policy and its incremental

treatment effect with respect to the control policy. We compare our

method against both state-of-the-art factorization methods and against

new approaches of causal recommendation and show significant

improvements in performance.

In dense clusters of neurons in nuclei, cells may interconnect via

soma-to-soma interactions, in addition to conventional synaptic

connections. We illustrate this idea with a multi-layer architecture

(MLA) composed of multiple clusters of recurrent sub-networks of

spiking Random Neural Networks (RNN) with dense soma-to-soma

interactions. We use this RNN-MLA architecture for deep learning. The

inputs to the clusters are normalised by adjusting the external

arrival rates of spikes to each cluster; we then apply this

architecture to learning from multi-channel datasets. We present

numerical results, based on both images and sensor-based data, that show

the value of this RNN-MLA for deep learning.

The optimization of deep neural networks can be more challenging than

traditional convex optimization problems due to the highly non-convex

nature of the loss function, e.g. it can involve pathological

landscapes such as saddle-surfaces that can be difficult to escape for

algorithms based on simple gradient descent. In this paper, we attack

the problem of optimization of highly non-convex neural networks by

starting with a smoothed, or *mollified*, objective

function whose energy landscape gradually becomes more non-convex

during training. Our proposal is inspired by recent studies in

continuation methods: similar to curriculum methods, we begin learning

an easier (possibly convex) objective function and let it evolve

during the training, until it eventually goes back to being the

original, difficult to optimize, objective function. The complexity of

the mollified networks is controlled by a single hyperparameter which

is annealed during the training. We show improvements on various

difficult optimization tasks and establish a relationship with recent

works on continuation methods for neural networks and mollifiers.

Ideals are just a normal distribution: the farther you are from the mean, the less you get squeezed.

- The Principle of Parsimony (Occam’s Razor)

• [cs.IR]**Comparing Neural and Attractiveness-based Visual Features
for Artwork Recommendation**

*Vicente Dominguez, Pablo Messina, Denis Parra, Domingo Mery, Christoph Trattner, Alvaro Soto*

http://arxiv.org/abs/1706.07515v1

• [cs.NE]**Multi-Output Artificial Neural Network for Storm Surge
Prediction in North Carolina**

*Anton Bezuglov, Brian Blanton, Reinaldo Santiago*

http://arxiv.org/abs/1609.07378v1

• [cs.LG]**Reinforcement Learning algorithms for regret minimization
in structured Markov Decision Processes**

*K J Prabuchandran, Tejas Bodas, Theja Tulabandhula*

http://arxiv.org/abs/1608.04929v1

The portal to the normal distribution:

Named after the friar William of Ockham, Occam’s Razor is a heuristic by

which we select among competing explanations. Ockham stated that we

should prefer the simplest explanation with the least moving parts: it

is easier to falsify (see: Falsification), easier to understand, and

more likely, on average, to be correct. This principle is not an iron

law but a tendency and a mindset: If all else is equal, it’s more likely

that the simple solution suffices. Of course, we also keep in mind

Einstein’s famous idea (even if apocryphal) that “an idea should be made

as simple as possible, but no simpler.”

Advances in image processing and computer vision in the latest years

have brought about the use of visual features in artwork

recommendation. Recent works have shown that visual features obtained

from pre-trained deep neural networks (DNNs) perform very well for

recommending digital art. Other recent works have shown that explicit

visual features (EVF) based on attractiveness can perform well in

preference prediction tasks, but no previous work has compared DNN

features versus specific attractiveness-based visual features (e.g.

brightness, texture) in terms of recommendation performance. In this

work, we study and compare the performance of DNN and EVF features for

the purpose of physical artwork recommendation using transaction data

from UGallery, an online store of physical paintings. In addition, we

perform an exploratory analysis to understand if DNN embedded features

have some relation with certain EVF. Our results show that DNN

features outperform EVF, that certain EVF features are more suited for

physical artwork recommendation and, finally, we show evidence that

certain neurons in the DNN might be partially encoding visual features

such as brightness, providing an opportunity for explaining

recommendations based on visual neural models.

During hurricane seasons, emergency managers and other decision makers

need accurate and “on-time” information on potential storm surge

impacts. Fully dynamical computer models, such as the ADCIRC tide,

storm surge, and wind-wave model take several hours to complete a

forecast when configured at high spatial resolution. Additionally,

statistically meaningful ensembles of high-resolution models (needed for

uncertainty estimation) cannot easily be computed in near real-time.

This paper discusses an artificial neural network model for storm

surge prediction in North Carolina. The network model provides fast,

real-time storm surge estimates at coastal locations in North

Carolina. The paper studies the performance of the neural network

model vs. other models on synthetic and real hurricane data.

A recent goal in the Reinforcement Learning (RL) framework is to

choose a sequence of actions or a policy to maximize the reward

collected or minimize the regret incurred in a finite time horizon.

For several RL problems in operation research and optimal control, the

optimal policy of the underlying Markov Decision Process (MDP) is

characterized by a known structure. The current state of the art

algorithms do not utilize this known structure of the optimal policy

while minimizing regret. In this work, we develop new RL algorithms

that exploit the structure of the optimal policy to minimize regret.

Numerical experiments on MDPs with structured optimal policies show

that our algorithms have better performance, are easy to implement,

have a smaller run-time, and require fewer random number

generations.

Over time, we drift back into the mediocrity where most of us pile up.

- Hanlon’s Razor

• [cs.IR]**Contextual Sequence Modeling for Recommendation with
Recurrent Neural Networks**

*Elena Smirnova, Flavian Vasile*

http://arxiv.org/abs/1706.07684v1

• [cs.NI]**Hydra: Leveraging Functional Slicing for Efficient
Distributed SDN Controllers**

*Yiyang Chang, Ashkan Rezaei, Balajee Vamanan, Jahangir Hasan, Sanjay Rao, T. N. Vijaykumar*

http://arxiv.org/abs/1609.07192v1

• [cs.MM]**Towards Music Captioning: Generating Music Playlist
Descriptions**

*Keunwoo Choi, George Fazekas, Mark Sandler*

http://arxiv.org/abs/1608.04868v1

“You're all caught up in a race. Even if you come first, what use is this approach? Will your knowledge grow? No, only the pressure will. This is a university, not a pressure cooker.”

Harder to trace in its origin, Hanlon’s Razor states that we should not

attribute to malice that which is more easily explained by stupidity. In

a complex world, this principle helps us avoid extreme paranoia and

ideology, often very hard to escape from, by not generally assuming that

bad results are the fault of a bad actor, although they can be. More

likely, a mistake has been made.

Recommendations can greatly benefit from good representations of the

user state at recommendation time. Recent approaches that leverage

Recurrent Neural Networks (RNNs) for session-based recommendations

have shown that Deep Learning models can provide useful user

representations for recommendation. However, current RNN modeling

approaches summarize the user state by only taking into account the

sequence of items that the user has interacted with in the past,

without taking into account other essential types of context

information such as the associated types of user-item interactions,

the time gaps between events and the time of day for each interaction.

To address this, we propose a new class of Contextual Recurrent Neural

Networks for Recommendation (CRNNs) that take contextual information

into account in both the input and output layers: they modify the

behavior of the RNN by combining the context embedding with the item

embedding and, more explicitly in the model dynamics, by

parametrizing the hidden-unit transitions as a function of context

information. We compare our CRNNs approach with RNNs and

non-sequential baselines and show good improvements on the next event

prediction task.

The conventional approach to scaling Software Defined Networking (SDN)

controllers today is to partition switches based on network topology,

with each partition being controlled by a single physical controller,

running all SDN applications. However, topological partitioning is

limited by the fact that (i) performance of latency-sensitive (e.g.,

monitoring) SDN applications associated with a given partition may be

impacted by co-located compute-intensive (e.g., route computation)

applications; (ii) simultaneously achieving low convergence time and

response times might be challenging; and (iii) communication between

instances of an application across partitions may increase latencies.

To tackle these issues, in this paper, we explore functional slicing,

a complementary approach to scaling, where multiple SDN applications

belonging to the same topological partition may be placed in

physically distinct servers. We present Hydra, a framework for

distributed SDN controllers based on functional slicing. Hydra chooses

partitions based on convergence time as the primary metric, but places

application instances across partitions in a manner that keeps

response times low while considering communication between

applications of a partition, and instances of an application across

partitions. Evaluations using the Floodlight controller show the

importance and effectiveness of Hydra in simultaneously keeping

convergence times on failures small, while sustaining higher

throughput per partition and ensuring responsiveness to

latency-sensitive applications.

Descriptions are often provided along with recommendations to help

users’ discovery. Recommending automatically generated music playlists

(e.g. personalised playlists) introduces the problem of generating

descriptions. In this paper, we propose a method for generating music

playlist descriptions, which we call music captioning. In the

proposed method, audio content analysis and natural language

processing are adopted to utilise the information of each track.

Inside the pressure cooker, we are like stacks of glass cups (in Chinese, a pun on “tragedies”).

- Second-Order Thinking

• [cs.IR]**Specializing Joint Representations for the task of Product
Recommendation**

*Thomas Nedelec, Elena Smirnova, Flavian Vasile*

http://arxiv.org/abs/1706.07625v1

• [cs.SD]**Discovering Sound Concepts and Acoustic Relations In
Text**

*Anurag Kumar, Bhiksha Raj, Ndapandula Nakashole*

http://arxiv.org/abs/1609.07384v1

• [cs.NE]**Power Series Classification: A Hybrid of LSTM and a Novel
Advancing Dynamic Time Warping**

*Yuanlong Li, Han Hu, Yonggang Wen, Jun Zhang*

http://arxiv.org/abs/1608.04171v2

Transparent glass, gradually fogged over by the steam.

In all human systems and most complex systems, the second layer of

effects often dwarfs the first layer, yet often goes unconsidered. In

other words, we must consider that effects have effects. Second-order

thinking is best illustrated by the idea of standing on your tiptoes at

a parade: Once one person does it, everyone will do it in order to see,

thus negating the first tiptoer. Now, however, the whole parade audience

suffers on their toes rather than standing firmly on their whole feet.

We propose a unified product embedded representation that is optimized

for the task of retrieval-based product recommendation. To this end,

we introduce a new way to fuse modality-specific product embeddings

into a joint product embedding, in order to leverage both product

content information, such as textual descriptions and images, and

product collaborative filtering signal. By introducing the fusion step

at the very end of our architecture, we are able to train each

modality separately, allowing us to keep a modular architecture that

is preferable in real-world recommendation deployments. We analyze our

performance on normal and hard recommendation setups such as

cold-start and cross-category recommendations and achieve good

performance on a large product shopping dataset.

In this paper we describe approaches for discovering acoustic concepts

and relations in text. The first major goal is to be able to identify

text phrases which contain a notion of audibility and can be termed as

a sound or an acoustic concept. We also propose a method to define an

acoustic scene through a set of sound concepts. We use pattern

matching and parts of speech tags to generate sound concepts from

large scale text corpora. We use dependency parsing and LSTM recurrent

neural network to predict a set of sound concepts for a given acoustic

scene. These methods are not only helpful in creating an acoustic

knowledge base but also directly help in acoustic event and scene

detection research in a variety of ways.

As many applications organize data into temporal sequences, the

problem of time series data classification has been widely studied.

Recent studies show that the 1-nearest neighbor with dynamic time

warping (1NN-DTW) and the long short term memory (LSTM) neural network

can achieve a better performance than other machine learning

algorithms. In this paper, we build a novel time series classification

algorithm hybridizing 1NN-DTW and LSTM, and apply it to a practical

data center power monitoring problem. Firstly, we define a new

distance measurement for the 1NN-DTW classifier, termed as Advancing

Dynamic Time Warping (ADTW), which is non-commutative and non-dynamic

programming. Secondly, we hybridize the 1NN-ADTW and LSTM together. In

particular, a series of auxiliary test samples generated by the linear

combination of the original test sample and its nearest neighbor with

ADTW are utilized to detect which classifier to trust in the hybrid

algorithm. Finally, using the power consumption data from a real data

center, we show that the proposed ADTW can improve the classification

accuracy from about 84% to 89%. Furthermore, with the hybrid

algorithm, the accuracy can be further improved and we achieve an

accuracy up to about 92%. Our research can inspire more studies on

non-commutative distance measurement and the hybrid of the deep

learning models with other traditional models.
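
The baseline the hybrid builds on, plain DTW distance inside a 1-nearest-neighbor classifier, can be sketched as follows. The ADTW variant, the LSTM branch, and the auxiliary-sample gating between the two are specific to the paper and omitted here; the training series and labels below are made up for illustration.

```python
def dtw(a, b):
    """Classic dynamic-programming DTW distance with absolute-difference cost."""
    n, m = len(a), len(b)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Allow insertion, deletion, or match steps through the grid.
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

def nn1_dtw(query, train):
    """1-nearest-neighbor under DTW: return the label of the closest series."""
    return min(train, key=lambda lt: dtw(query, lt[1]))[0]

train = [("rising", [0, 1, 2, 3]), ("falling", [3, 2, 1, 0])]
label = nn1_dtw([0, 0, 1, 2, 2, 3], train)  # a time-warped rising pattern
```

Note that DTW tolerates warping for free: a series and its step-doubled copy are at distance zero, which Euclidean distance would heavily penalize.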

The moment the lid comes off: a crack, then shattered in an instant.

- The Map Is Not the Territory

• [cs.IT]**A Combinatorial Methodology for Optimizing Non-Binary
Graph-Based Codes: Theoretical Analysis and Applications in Data
Storage**

*Ahmed Hareedy, Chinmayi Lanka, Nian Guo, Lara Dolecek*

http://arxiv.org/abs/1706.07529v1

• [cs.SD]**Novel stochastic properties of the short-time spectrum for
unvoiced pronunciation modeling and synthesis**

*Xiaodong Zhuang, Nikos E. Mastorakis*

http://arxiv.org/abs/1609.07245v1

• [cs.SE]**A Proposal for the Measurement and Documentation of
Research Software Sustainability in Interactive Metadata
Repositories**

*Stephan Druskat*

http://arxiv.org/abs/1608.04529v2

And yet, despite all this, people still throw themselves in, wave after wave.

The map of reality is not reality itself. If any map were to represent

its actual territory with perfect fidelity, it would be the size of the

territory itself. Thus, no need for a map! This model tells us that

there will always be an imperfect relationship between reality and the

models we use to represent and understand it. This imperfection is a

necessity in order to simplify. It is all we can do to accept this and

act accordingly.

Non-binary (NB) low-density parity-check (LDPC) codes are graph-based

codes that are increasingly being considered as a powerful error

correction tool for modern dense storage devices. The increasing

levels of asymmetry incorporated by the channels underlying modern

dense storage systems exacerbate the error floor problem. In recent

research, the weight consistency matrix (WCM) framework was introduced

as an effective NB-LDPC code optimization methodology that is suitable

for modern Flash memory and magnetic recording (MR) systems. In this

paper, we provide the in-depth theoretical analysis needed to

understand and properly apply the WCM framework. We focus on general

absorbing sets of type two (GASTs). In particular, we introduce a

novel tree representation of a GAST called the unlabeled GAST tree,

using which we prove that the WCM framework is optimal. Then, we

enumerate the WCMs. We demonstrate the significance of the savings

achieved by the WCM framework in the number of matrices processed to

remove a GAST. Moreover, we provide a linear-algebraic analysis of the

null spaces of WCMs associated with a GAST. We derive the minimum

number of edge weight changes needed to remove a GAST via its WCMs,

along with how to choose these changes. Additionally, we propose a new

set of problematic objects, namely the oscillating sets of type two

(OSTs), which contribute to the error floor of NB-LDPC codes with even

column weights on asymmetric channels, and we show how to customize

the WCM framework to remove OSTs. We also extend the domain of the WCM

framework applications by demonstrating its benefits in optimizing

column weight 5 codes, codes used over Flash channels with soft

information, and spatially-coupled codes. The performance gains

achieved via the WCM framework range between 1 and nearly 2.5 orders

of magnitude in the error floor region over interesting channels.

Stochastic property of speech signal is a fundamental research topic

in speech analysis and processing. In this paper, multiple levels of

randomness in speech signal are discussed, and the stochastic

properties of unvoiced pronunciation are studied in detail, which has

not received sufficient research attention before. The study is based

on the signals of sustained unvoiced pronunciation captured in the

experiments, for which the amplitude and phase values in the

short-time spectrum are studied as random variables. The statistics of

amplitude for each frequency component is studied individually, based

on which a new property of “consistent standard deviation coefficient”

is revealed for the amplitude spectrum of unvoiced pronunciation. The

relationship between the amplitude probability distributions of

different frequency components is further studied, which indicates

that all the frequency components have a common prototype of amplitude

probability distribution. As an adaptive and flexible probability

distribution, the Weibull distribution is adopted to fit the

expectation-normalized amplitude spectrum data. The phase distribution

for the short-time spectrum is also studied, and the results show a

uniform distribution. A synthesis method for unvoiced pronunciation is

proposed based on the Weibull distribution of amplitude and uniform

distribution of phase, which is implemented by STFT with artificially

generated short-time spectrum with random amplitude and phase. The

synthesis results have identical quality of auditory perception as the

original pronunciation, and have similar autocorrelation as that of

the original signal, which proves the effectiveness of the proposed

stochastic model of short-time spectrum for unvoiced pronunciation.
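
The synthesis recipe in the abstract (Weibull-distributed amplitudes, uniformly distributed phases, then an inverse transform back to the time domain) can be sketched for a single frame. The frame length and Weibull shape below are illustrative assumptions; a full implementation would overlap-add many such frames via STFT.

```python
import numpy as np

def synthesize_unvoiced_frame(n=512, weibull_shape=2.0, rng=None):
    """Generate one noise-like frame: draw Weibull-distributed amplitudes and
    uniformly distributed phases for the positive-frequency bins, then take
    the inverse real FFT to obtain a real time-domain signal."""
    if rng is None:
        rng = np.random.default_rng(0)
    n_bins = n // 2 + 1
    amp = rng.weibull(weibull_shape, size=n_bins)          # amplitude spectrum
    phase = rng.uniform(0.0, 2.0 * np.pi, size=n_bins)     # uniform phase
    spec = amp * np.exp(1j * phase)
    spec[0] = 0.0             # zero the DC bin so the frame has zero mean
    spec[-1] = spec[-1].real  # the Nyquist bin must be real for a real signal
    return np.fft.irfft(spec, n=n)

frame = synthesize_unvoiced_frame()
```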

This paper proposes an interactive repository type for research

software metadata which measures and documents software sustainability

by accumulating metadata, and computing sustainability metrics over

them. Such a repository would help to overcome technical barriers to

software sustainability by furthering the discovery and identification

of sustainable software, thereby also facilitating documentation of

research software within the framework of software management plans.

Please remember: “The heart is fragile. You have to learn to soothe it. No matter how great the difficulty, tell your heart: All is well.”

- Thought Experiments

• [cs.IT]**Common-Message Broadcast Channels with Feedback in the
Nonasymptotic Regime: Full Feedback**

*Kasper Fløe Trillingsgaard, Wei Yang, Giuseppe Durisi, Petar Popovski*

http://arxiv.org/abs/1706.07731v1

• [math.OC]**Screening Rules for Convex Problems**

*Anant Raj, Jakob Olbrich, Bernd Gärtner, Bernhard Schölkopf, Martin
Jaggi*

http://arxiv.org/abs/1609.07478v1

• [cs.SI]**Feature Driven and Point Process Approaches for Popularity
Prediction**

*Swapnil Mishra, Marian-Andrei Rizoiu, Lexing Xie*

http://arxiv.org/abs/1608.04862v1