Numerous sensors have been deployed in different geospatial locations to continuously and cooperatively monitor the surrounding environment, such as the air quality. These sensors generate multiple geo-sensory time series, with spatial correlations between their readings. Forecasting geo-sensory time series is of great importance yet very challenging as it is affected by many complex factors, i.e., dynamic spatio-temporal correlations and external factors. In this paper, we predict the readings of a geo-sensor over several future hours by using a multi-level attention-based recurrent neural network that considers multiple sensors' readings, meteorological data, and spatial data. More specifically, our model consists of two major parts: 1) a multi-level attention mechanism to model the dynamic spatio-temporal dependencies. 2) a general fusion module to incorporate the external factors from different domains. Experiments on two types of real-world datasets, viz., air quality data and water quality data, demonstrate that our method outperforms nine baseline methods.
The rapid development of sensor networks enables recognition of complex activities (CAs) using multivariate time series. However, CAs are usually performed over long periods of time, which causes slow recognition by models based on fully observed data. Therefore, predicting CAs at early stages becomes an important problem. In this paper, we propose Simultaneous Complex Activities Recognition and Action Sequence Discovering (SimRAD), an algorithm which predicts a CA over time by mining a sequence of multivariate actions from sensor data using a Deep Neural Network. SimRAD simultaneously learns two probabilistic models for inferring CAs and action sequences, where the estimations of the two models are conditionally dependent on each other. SimRAD continuously predicts the CA and the action sequence, thus the predictions are mutually updated until the end of the CA. We conduct evaluations on a real-world CA dataset consisting of a rich amount of sensor data, and the results show that SimRAD outperforms state-of-the-art methods by average 7.2% in prediction accuracy with high confidence.
Recent advances in video surveillance systems enable a new paradigm for intelligent urban traffic management systems. Since surveillance cameras are usually sparsely located to cover key regions of the road under surveillance, it is a big challenge to perform a complete real-time traffic pattern analysis based on incomplete sparse surveillance information. As a result, existing works mostly focus on predicting traffic volumes with historical records available at a particular location and may not provide a complete picture of real-time traffic patterns. To this end, in this paper, we go beyond existing works and tackle the challenges of traffic flow analysis from three perspectives. First, we train the transition probabilities to capture vehicles' movement patterns. The transition probabilities are trained from third-party vehicle GPS data, and thus can work in the area even if there is no camera. Second, we exploit the Multivariate Normal Distribution model together with the transferred probabilities to estimate the unobserved traffic patterns. Third, we propose an algorithm for real-time traffic inference with surveillance as a complement source of information. Finally, experiments on real-world data show the effectiveness of our approach.
A continuous-time tensor factorization method is developed for event sequences containing multiple "modalities." Each data element is a point in a tensor, whose dimensions are associated with the discrete alphabet of the modalities. Each tensor data element has an associated time of occurence and a feature vector. We model such data based on pairwise interactive point processes, and the proposed framework connects pairwise tensor factorization with a feature-embedded point process. The model accounts for interactions within each modality, interactions across different modalities, and continuous-time dynamics of the interactions. Model learning is formulated as a convex optimization problem, based on online alternating direction method of multipliers. Compared to existing state-of-the-art methods, our approach captures the latent structure of the tensor and its evolution over time, obtaining superior results on real-world datasets.
Embeddings of medical concepts such as medica- tion, procedure and diagnosis codes in Electronic Medical Records (EMRs) are central to health- care analytics. Previous work on medical concept embedding takes medical concepts and EMRs as words and documents respectively. Nevertheless, such models miss out the temporal nature of EMR data. On the one hand, two consecutive medical concepts do not indicate they are temporally close, but the correlations between them can be revealed by the time gap. On the other hand, the temporal scopes of medical concepts often vary greatly (e.g., common cold and diabetes). In this paper, we pro- pose to incorporate the temporal information to em- bed medical codes. Based on the Continuous Bag- of-Words model, we employ the attention mecha- nism to learn a “soft” time-aware context window for each medical concept. Experiments on pub- lic and proprietary datasets through clustering and nearest neighbour search tasks demonstrate the ef- fectiveness of our model, showing that it outper- forms five state-of-the-art baselines.
Causal inference in time series is an important problem in many fields. Traditional methods use regression models for this problem. The inference accuracies of these methods depend greatly on whether or not the model can be well fitted to the data, and therefore we are required to select an appropriate regression model, which is difficult in practice. This paper proposes a supervised learning framework that utilizes a classifier instead of regression models. We present a feature representation that employs the distance between the conditional distributions given past variable values and show experimentally that the feature representation provides sufficiently different feature vectors for time series with different causal relationships. Furthermore, we extend our framework to multivariate time series and present experimental results where our method outperformed the model-based methods and the supervised learning method for i.i.d. data.
Evolutionary algorithms (EAs) have been widely applied to solve multi-objective optimization problems. In contrast to great practical successes, their theoretical foundations are much less developed, even for the essential theoretical aspect, i.e., running time analysis. In this paper, we propose a general approach to estimating upper bounds on the expected running time of multi-objective EAs (MOEAs), and then apply it to diverse situations, including bi-objective and many-objective optimization as well as exact and approximate analysis. For some known asymptotic bounds, our analysis not only provides their leading constants, but also improves them asymptotically. Moreover, our results provide some theoretical justification for the good empirical performance of MOEAs in solving multi-objective combinatorial problems.
Hierarchical Electricity Time Series Forecasting for Integrating Consumption Patterns Analysis and Aggregation Consistency
Electricity demand forecasting is a very important problem for energy supply and environmental protection. It can be formalized as a hierarchical time series forecasting problem with the aggregation constraints according to the geographical hierarchy, since the sum of the prediction results of the disaggregated time series should be equal to the prediction results of the aggregated ones. However in most previous work, the aggregation consistency is ensured at the loss of forecast accuracy. In this paper, we propose a novel clustering-based hierarchical electricity time series forecasting approach. Instead of dealing with the geographical hierarchy directly, we explore electricity consumption patterns by clustering analysis and build a new consumption pattern based time series hierarchy. We then present a novel hierarchical forecasting method with consumption hierarchical aggregation constraints to improve the electricity demand predictions of the bottom level, followed by a bottom-up" method to obtain forecasts of the geographical higher levels. Especially, we observe that in our consumption pattern based hierarchy the reconciliation error of the bottom level time series is
correlated" to its membership degree of the corresponding cluster (consumption pattern), and hence apply this correlations as the regularization term in our forecasting objective function. Extensive experiments on real-life datasets verify that our approach achieves the best prediction accuracy, compared with the state-of-the-art methods.
In the smart power grid, short-term load forecasting (STLF) is a crucial step in scheduling and planning for future load, so as to improve the reliability, cost, and emissions of the power grid. Different from traditional time series forecast, STLF is a more challenging task, because of the complex demand of active and reactive power from numerous categories of electrical loads and the effects of environment. Therefore, we propose NeuCast, a seasonal neural forecasting method, which dynamically models various loads as co-evolving time series in a hidden space, as well as extra weather conditions, in a neural network structure. NeuCast captures seasonality and patterns of the time series by integrating factor modeling and hidden state recognition. NeuCast can also detect anomalies and forecast under different temperature assumptions. Extensive experiments on 134 real-word datasets show the improvements of NeuCast over the stateof-the-art methods.