9:00-9:30 | Welcome and opening words
9:30-10:30 |
3D- and manifold-valued point processes
Rasmus Waagepetersen
A $K$-function is presented for assessing second-order properties of inhomogeneous fiber patterns generated by marked point processes. The $K$-function takes into account geometric features of the fibers, such as tangent directions, and requires an estimate of the inhomogeneous density function of the fiber pattern. We introduce parametric estimates of the density function based on models that represent large-scale features of the inhomogeneous fiber pattern. The proposed methodology is applied to simulated fiber patterns as well as a three-dimensional data set of steel fibers in concrete.
A $K$-function for inhomogeneous fiber patterns
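As a point of reference for the $K$-function above, the classical homogeneous estimator can be sketched in a few lines. The snippet below is a minimal Python illustration of the standard unmarked Ripley $K$ estimator without edge correction, not the fiber-marked inhomogeneous version of the talk; all names are ours.

```python
import math
import random

def k_function(points, r, area):
    """Naive estimate of Ripley's K(r) for a point pattern observed in a
    window of the given area, without edge correction:
    K_hat(r) = area / (n*(n-1)) * #{ordered pairs at distance <= r}."""
    n = len(points)
    if n < 2:
        return 0.0
    close = sum(
        1
        for i, (xi, yi) in enumerate(points)
        for j, (xj, yj) in enumerate(points)
        if i != j and math.hypot(xi - xj, yi - yj) <= r
    )
    return area * close / (n * (n - 1))

# For a homogeneous Poisson process, K(r) is close to pi * r^2
# (slightly below it here, because we skip edge correction).
random.seed(1)
pts = [(random.random(), random.random()) for _ in range(500)]
k_hat = k_function(pts, 0.05, 1.0)
```

The inhomogeneous versions replace the constant-intensity normalisation by estimated intensity weights for each pair; the fiber-marked version of the talk additionally accounts for tangent directions.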
Ed Cohen, Scott Ward, Niall Adams and Heather Battey
The development of statistical methods designed for analysing spatial point patterns has typically focused on Euclidean data and planar surfaces. However, with recent advances in 3D biological imaging technologies targeting protein molecules on cellular membranes, spatial point patterns are now being observed on complex shapes and manifolds whose geometry must be respected. Consequently, there is now a demand for tools that can analyse these data for important scientific studies in cellular and microbiology. For this purpose, we extend the classical functional summary statistics for spatial point patterns to general convex bounded shapes. Using the Mapping Theorem, a Poisson process can be transformed from any convex shape to a Poisson process on the unit sphere, where existing theory and methods can be leveraged. We present the first- and second-order properties of such summary statistics and demonstrate how they can be used to construct test statistics to determine whether an observed pattern exhibits complete spatial randomness on the original convex space. We will then present recent work in which we develop the kernel intensity estimator for manifolds.
New statistical methods for analysing spatial point patterns on 3D shapes and manifolds
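The kernel intensity estimator mentioned above has a simple Euclidean counterpart, sketched below; the manifold version of the talk replaces Euclidean distances by the intrinsic geometry of the surface. This is a minimal illustration with our own naming, not the authors' estimator.

```python
import math

def kernel_intensity(points, x, y, bandwidth):
    """Planar Gaussian kernel intensity estimate at location (x, y),
    without edge correction: lambda_hat(u) = sum_i K_h(u - x_i),
    where K_h is the isotropic Gaussian density with sd = bandwidth."""
    h2 = bandwidth * bandwidth
    norm = 1.0 / (2.0 * math.pi * h2)
    return sum(
        norm * math.exp(-((x - px) ** 2 + (y - py) ** 2) / (2.0 * h2))
        for px, py in points
    )

# two points; evaluate the estimate at the first one
pts = [(0.0, 0.0), (1.0, 0.0)]
lam = kernel_intensity(pts, 0.0, 0.0, 0.5)
```

On a manifold, the squared Euclidean distance above would be replaced by a geodesic distance and the normalisation adjusted to the local geometry.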
10:30-11:00 | Coffee break with fruit
11:00-12:00 |
Marked point processes
Matthias Eckardt
The analysis of more complex spatial point process scenarios has gained immense attention within the last decade. While there has been impressive progress in the theory and methodology of (marked) spatial point processes with points on, e.g., networks, Euclidean graphs or the sphere, only a few extensions to the non-scalar (non-Euclidean) mark setting exist. In particular, apart from function-valued marks, the treatment of more general object-valued marks has only just been introduced. This talk addresses a particular type of object-valued marks and focuses on the case where each point is augmented by a density-valued quantity.
References: [1] Matthias Eckardt, Sonja Greven and Mari Myllymäki (2023): Composition-valued marked spatial point processes. Working paper. [2] Matthias Eckardt, Carles Comas and Jorge Mateu (2023): Summary characteristics for multivariate function-valued spatial point process attributes. Working paper. [3] Matthias Eckardt, Alessandra Menafoglio and Sonja Greven (2023): Spatial point processes with density-valued marks. Working paper.
On spatial point processes with density-valued marks
Robin Markwitz, Marie-Colette van Lieshout
Aoristic data are interval-censored data depicting time spans in which an event occurred with certainty, but where the exact time of occurrence is unknown. To infer the occurrence times, we implement a Bayesian approach. For the forward model, we use a marked point process, with the marks representing the censored intervals. The prior is a Markov point process. Previous approaches model aoristic data under the assumption that censoring occurs homogeneously in time. In this work, we introduce a non-homogeneous, semi-Markov-inspired censoring mechanism that is able to capture seasonal variations.
Inferring occurrence times from interval-censored data: a non-homogeneous semi-Markov approach
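To fix ideas, here is a toy sketch (ours, not the authors' semi-Markov mechanism) of the basic building block: drawing an occurrence time inside a censoring interval with density proportional to a seasonal rate function, via inverse-CDF sampling on a grid.

```python
import math
import random

def sample_occurrence(t0, t1, rate, rng, n_grid=1000):
    """Draw an occurrence time in the censoring interval [t0, t1] with
    density proportional to the (seasonal) rate function, by inverse-CDF
    sampling on a discrete grid of n_grid + 1 time points."""
    ts = [t0 + (t1 - t0) * k / n_grid for k in range(n_grid + 1)]
    weights = [rate(t) for t in ts]
    u = rng.random() * sum(weights)
    acc = 0.0
    for t, w in zip(ts, weights):
        acc += w
        if acc >= u:
            return t
    return t1

# toy seasonal rate over one year: more events near t = 0 and t = 1
rate = lambda t: 1.0 + math.cos(2.0 * math.pi * t)
rng = random.Random(0)
draws = [sample_occurrence(0.0, 1.0, rate, rng) for _ in range(2000)]
```

Under this rate, draws concentrate at the edges of the year; a full Bayesian treatment would embed such draws in an MCMC scheme over all events jointly.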
12:00-13:30 | Lunch [social area]
14:00-15:00 |
(Semi- and non-)parametric methodologies
Francisco Cuevas-Pacheco, Jean-François Coeurjolly, Marie-Hélène Descary
Estimating the first-order intensity function in point pattern analysis is an important problem, and it has so far been approached from different perspectives. Motivated by eye-movement data, we introduce a convolution-type model where the log-intensity is modelled as the convolution of a function $\beta(\cdot)$, to be estimated, and a single spatial covariate (for eye-movement data, the image an individual is looking at). Based on a Fourier series expansion, we show that the proposed model can be viewed as a log-linear model with an infinite number of coefficients, corresponding to the spectral decomposition of $\beta(\cdot)$. After truncation, we estimate these coefficients through a penalized Poisson likelihood.
A convolution-type model for the intensity of spatial point processes applied to eye-movement data
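The convolution structure of the model can be illustrated on a discrete grid. The sketch below is our illustration, not the authors' implementation (which works via truncated Fourier coefficients of $\beta$ and, in practice, FFTs): the log-intensity is a circular convolution of a small kernel with a covariate image.

```python
import math

def conv2_circular(beta, z):
    """Discrete circular convolution of a kernel beta with a covariate
    image z on the same grid:
    out[i][j] = sum_{k,l} beta[k][l] * z[(i-k) % n][(j-l) % m]."""
    n, m = len(z), len(z[0])
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            s = 0.0
            for k in range(n):
                for l in range(m):
                    s += beta[k][l] * z[(i - k) % n][(j - l) % m]
            out[i][j] = s
    return out

z = [[1.0, 0.0], [0.0, 0.0]]      # covariate image (a point mass, for checking)
beta = [[0.5, 0.2], [0.1, 0.0]]   # kernel; in the model this is estimated
log_lam = conv2_circular(beta, z)
lam = [[math.exp(v) for v in row] for row in log_lam]
```

With a point-mass covariate, the convolution simply reproduces the kernel, which makes the structure of the model easy to check.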
Radu S. Stoica
Parameter estimation for point processes is achieved by solving optimisation problems built using general strategies, of which three are well established. The first considers contrast functions based on summary statistics. The second uses the pseudo-likelihood. The third approximates the likelihood function via Monte Carlo procedures. Each of these techniques has known advantages and drawbacks (Møller and Waagepetersen, 2004; van Lieshout, 2001, 2019).
Sampling point process posterior densities is an inference approach deeply intertwined with the previous one, since it allows simultaneous parameter estimation and statistical tests based on observations. The auxiliary variable method (Møller et al., 2006) gives the mathematical solution to this problem, while pointing out the difficulties of its practical implementation due to poor mixing. The exchange algorithm (Murray et al., 2006; Caimo and Friel, 2011) offers a solution to the poor mixing induced by the auxiliary variable method. Like its predecessor, it requires exact simulation for the sampling of the auxiliary variable. This is not really a drawback, but it may blow up the computational time for models exhibiting strong interactions (van Lieshout and Stoica, 2006).
This talk presents the approximate ABC Shadow and SSA methods as inference methods complementary to those based on posterior density sampling. These methods do not require exact simulation, while providing the necessary theoretical control. The derived algorithms are applied to data from several application domains such as astronomy, geosciences and network sciences (Stoica et al., 2017; Stoica et al., 2021; Hurtado et al., 2021; Laporte et al., 2022).
Approximate inference for marked Gibbs point processes
15:00-15:30 | Coffee break with fruit
15:30-17:30 |
Spatio-temporal models
Jean-François Coeurjolly, Thibault Espinasse and Anne-Laure Fougères
While modelling the spatio-temporal intensity of the point pattern formed by lightning strikes in France over the period 2011-2015, we were faced with numerical problems which can be explained by the following facts: the number of impacts is huge, the spatial and spatio-temporal covariates are constant over a very fine spatial or spatio-temporal grid, and, last but not least, the number of (spatio-temporal) cells over which no lightning strike is observed is huge. In this talk, I'll revisit standard parametric methods with constant covariates and present an extension of these using zero-deflated subsampling to handle the excess of zeroes.
Spatio-temporal point process intensity estimation using zero-deflated subsampling
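One common form of such subsampling (a sketch under our own assumptions, not necessarily the exact scheme of the talk) keeps every cell containing a point, retains each empty cell only with probability q, and reweights the retained empty cells by 1/q, Horvitz-Thompson style, so that the weighted estimating equation stays unbiased:

```python
import random

def subsample_zero_cells(counts, q, rng):
    """Keep every cell with at least one point; keep a zero-count cell
    with probability q and give it weight 1/q (Horvitz-Thompson style)."""
    kept = []
    for c in counts:
        if c > 0:
            kept.append((c, 1.0))
        elif rng.random() < q:
            kept.append((c, 1.0 / q))
    return kept

def poisson_rate_estimate(kept, cell_area):
    """Weighted MLE of a constant intensity from (count, weight) cells:
    lambda_hat = weighted total count / weighted total exposure."""
    total = sum(c * w for c, w in kept)
    exposure = cell_area * sum(w for _, w in kept)
    return total / exposure

rng = random.Random(42)
counts = [1, 0, 0, 2, 0, 0, 0, 1, 0, 0] * 100   # 1000 cells, mostly empty
kept = subsample_zero_cells(counts, 0.2, rng)
lam_hat = poisson_rate_estimate(kept, 1.0)
```

Here the full-data rate is 0.4 per unit area while only about a fifth of the empty cells enter the fit, which is the computational point of the exercise.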
Frédéric Lavancier, Lisa Balsollier
Birth-death-move processes with mutations are models for the spatio-temporal dynamics of a system of marked particles in motion, where births, deaths and mutations can occur. We consider in this talk the parametric estimation of such processes. We obtain the expression of the likelihood function and we prove that, under some regularity conditions, the model satisfies the LAN (local asymptotic normality) property, implying the optimality of regular estimators. We apply this result to the maximum likelihood estimation of parameters involved in the birth kernel of the process. After a short simulation study that confirms our theoretical findings, we analyse a real dataset representing the spatio-temporal dynamics of biomolecules observed in a living cell. Our estimation reveals and quantifies a colocalisation phenomenon between two types of proteins involved in the membrane trafficking of the cell. This work is part of the ongoing PhD study of Lisa Balsollier.
Parametric estimation of birth-death-move processes with mutations
Marie-Colette van Lieshout, Changqing Lu, Maurits de Graaf and Paul Visscher
Chimney fires constitute one of the most commonly occurring fire types. Precise prediction and prompt prevention are crucial in reducing the harm they cause. We develop a combined machine learning and statistical modelling process to predict fire risk. Firstly, we use random forests and permutation importance techniques to identify the most informative explanatory variables. Secondly, we design a Poisson point process model and employ logistic regression estimation to estimate the parameters. Thirdly, we establish strong consistency and asymptotic normality for the parameters and propose a consistent estimator of the asymptotic covariance matrix that can be used to build confidence intervals for the predicted fire risk. Finally, we validate the Poisson model assumption using second-order summary statistics and residuals and implement the modelling process on data collected by the Twente Fire Brigade. This approach has two advantages: i) with random forests, we can select explanatory variables nonparametrically, taking variable dependence into account; ii) using logistic regression estimation, we can fit our statistical model efficiently by tuning it to focus on regions and times that are salient for fire risk.
References: C. Lu, M.N.M. van Lieshout, M. de Graaf and P.J. Visscher: Data-driven chimney fire risk prediction using machine learning and point process tools. Annals of Applied Statistics, to appear. M.N.M. van Lieshout and C. Lu: Infill asymptotics for logistic regression estimators for spatio-temporal point processes. arXiv:2208.12080, August 2022.
Data-driven chimney fire risk prediction using machine learning and point process tools
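For a constant intensity, logistic regression estimation has a closed form that makes the idea concrete: with dummy points from a Poisson process of intensity rho, the logistic score equation reduces to n*rho = m*lambda. The sketch below is our simplified illustration of this special case, not the covariate-driven model of the paper.

```python
import random

def logistic_intensity_estimate(n_data, n_dummy, dummy_rho):
    """For a constant intensity lambda and dummy points from a Poisson
    process of intensity rho, the logistic score equation
        sum_data rho/(lambda+rho) = sum_dummy lambda/(lambda+rho)
    reduces to n*rho = m*lambda, giving lambda_hat = rho * n / m."""
    return dummy_rho * n_data / n_dummy

# Bernoulli cell counts as a discrete stand-in for the two point
# processes on a window of area 1000 (100,000 cells of area 0.01).
rng = random.Random(11)
n_cells, cell_area = 100_000, 0.01
lam_true, rho = 0.5, 2.0
n = sum(rng.random() < lam_true * cell_area for _ in range(n_cells))  # data points
m = sum(rng.random() < rho * cell_area for _ in range(n_cells))       # dummy points
lam_hat = logistic_intensity_estimate(n, m, rho)
```

With covariates, the same score equation becomes a weighted logistic regression of presence (data) versus dummy points, which is what makes the estimation efficient in practice.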
Mehdi Moradi, Ottmar Cronie, Unai Pérez-Goya, Jorge Mateu
Detecting change-points in multivariate settings is usually carried out by analyzing all marginals either independently, via univariate methods, or jointly, through multivariate approaches. The former discards any inherent dependencies between different marginals, and the latter may suffer from domination/masking among the change-points of distinct marginals. As a remedy, we propose an approach which groups marginals with similar temporal behaviours and then performs group-wise multivariate change-point detection. Our approach groups marginals based on hierarchical clustering, using distances which adjust for inherent dependencies.
Hierarchical Spatio-Temporal Change-Point Detection
18:00-20:45 |
Poster session
Viktor Beneš
Motivation comes from materials research, when observing particles on grain boundaries in the microstructure of polycrystals. Given a bounded Laguerre tessellation, the homogeneous Poisson process on the planar network of faces is second-order pseudostationary when the shortest-path distance on the network is considered. Using the coarea formula, we derive a formula for the geometrically corrected $K$-function and suggest an algorithm for an approximate numerical evaluation of the correction factor. Finally, a cluster process on the planar network is introduced and investigated.
Point processes on 3D tessellation faces
Juha Heikkinen, Helena M. Henttonen, Matti Katila, Sakari Tuominen
To avoid measuring lots of zeroes, the Finnish National Forest Inventory (NFI; Korhonen et al., doi.org/10.14214/sf.10662) distributes field sample plots in the sparsely forested northernmost part of the country using stratified sampling. Strata with a small predicted proportion of forest are sampled less intensively. This is based on maps produced using earlier NFI measurements and satellite images (Mäkisara et al., http://urn.fi/URN:ISBN:978-952-380-538-5). In this study, we evaluated methods to sample within the fragmented strata. Among spatially regular designs, the local pivotal method (Grafström et al., doi.org/10.1111/j.1541-0420.2011.01699.x) was more efficient than systematic sampling in stratified sampling, but – somewhat surprisingly – also in unstratified sampling.
Stratified, spatially regular and balanced sampling
Julia Jansson, Ottmar Cronie
In classical statistics, there is usually an underlying i.i.d. assumption. However, in the field of spatial and temporal statistics we often have dependencies, for example in forestry, seismology and epidemiology. A point process may be viewed as a generalized random sample in which we allow dependence and a random sample size. Recently, a prediction-based statistical theory called point process learning (PPL) was proposed. It is based on the combination of two novel concepts for general point processes: cross-validation and prediction errors. We here explore how PPL can be exploited to fit a range of common Gibbs point process models, such as the hard-core process, the Strauss process and the Geyer saturation process. Through a simulation study, we compare the performance of PPL in terms of MSE and MAPE to pseudo-likelihood methods, which are the state of the art in the context of the studied models. We found that PPL outperforms pseudo-likelihood for all considered models. In addition, we propose a procedure to obtain resample-based confidence regions for the model parameters. The empirical coverage was studied for these so-called PPL-confidence intervals, and we found that there is merit to the confidence region idea.
Point process learning for Gibbs process models
Kateřina Helisová, Vesna Gotovac Đogaš, Bogdan Radović, Jakub Staněk
The poster concerns a new statistical method for assessing the similarity of two random sets based on one realisation of each of them. The method focuses on the shapes of the components in the given realisations, namely on the curvature of their boundaries together with the ratios of their perimeters and areas. A brief theoretical background is introduced; the method is described, justified by a simulation study and applied to real data.
Two-step method for assessing similarity of realisations of random sets
Konstantinos Konstantinou, Tomáš Mrkvička, Mikko Kuronen, Mari Myllymäki
Permutation-based global envelope tests allow us to perform simultaneous inference for the quantile regression process, even in the presence of nuisance covariates. When the covariate of interest is categorical, graphical n-sample tests for comparing distributions can be constructed. The global tests are applied to nerve fiber point patterns to compare the nerve tree size distributions between patients with diabetes and healthy controls.
Global tests for quantile regression with applications in modeling distributions
Marcela Mandarić, Vesna Gotovac Đogaš
We use the tools of topological data analysis to detect features of random sets, in order to classify random sets and to detect outliers. We track the birth and death of connected components and cycles that emerge and disappear as the realization of the random set is eroded and dilated by disks of increasing radii. The persistence diagram keeps that information, and we view it as an empirical measure. A statistical depth can be defined on that (random) measure, for example by using support functions of the corresponding lift zonoid and applying methods for assigning depth to functional data. We apply this to real data concerning mastopathic and mammary cancer breast tissue histological images. We also use the persistence diagram for testing goodness of fit to the Boolean model, using so-called accumulated persistence functions as test functions for the global envelope test, and compare the results with other test functions such as the capacity functional, the spherical contact distribution function and the support function of the lift zonoid. It appears to be a very useful tool for recognizing whether clustering or repulsion occurs and can be used for the classification of random sets.
Nonparametric statistics for random sets via topological data analysis
Joel Kostensalo, Lauri Mehtätalo, Sakari Tuominen, Petteri Packalen and Mari Myllymäki
The estimation of forest attributes representing aspects of structural biodiversity over vast areas is difficult not only because of the lack of agreement on what constitutes a structurally diverse forest, but also due to difficulties regarding the detection of small trees. However, there is a consensus that a structurally diverse forest should have large variation in tree size (i.e., height and diameter), as well as some clustering of trees, as a regular tree pattern is typical of managed forests. We developed a framework for building structurally representative tree maps using airborne laser scanning data combined with a limited number of ground measurements, using data from two locations in Finland. In the proposed method, an individual tree detection algorithm is optimized so that the numbers of missing trees and false discoveries are minimized. The ground measurements are then used to train models for the number and location of missing trees and false discoveries. This model can then be applied to other areas in the proximity of the ground measurements to build tree maps, with the location, height and diameter of each tree. The methodology was shown to reproduce the number of trees, the height distribution and the spatial pattern of the trees with sufficient accuracy for practical use in large-scale mapping of forest attributes. Our next step is to account for tree species when building tree maps.
Creating countrywide tree maps with individual tree detection and bootstrapping
Alexis Pellerin, Juliette Blanchet, Jean-François Coeurjolly
We model cloud-to-ground lightning strikes collected over the period 2011-2021 for the French Alps using a spatio-temporal point process. This results in the observation of more than a million points. With this characteristic in mind, we investigate the non-parametric estimation of the spatio-temporal intensity function as well as the intensity functions of the point processes aggregated over space and time. Then, we present a non-parametric approach to test the first-order separability of the spatio-temporal point process. A negative answer to this question leads to natural perspectives presented at the end of the extended summary.
Intensity estimation and first-order separability testing of lightning strikes in the French Alps
Niklas Rottmayer, Claudia Redenbach
Geometric modelling of materials microstructures makes it possible to generate synthetic samples, e.g. for training neural networks or simulating material properties. For fitting and comparing models, a measure of general similarity between realizations of geometric models is beneficial. Ideally, similarity should be measured based on features describing, e.g., directionality, porosity, surface texture or shape. We propose such similarity measures and evaluate them on Boolean models of spheres, ellipsoids, cylinders, cubes and cuboids.
Measuring similarity between realizations of geometric models
Changqing Lu, Yongtao Guan, Ganggang Xu, Marie-Colette van Lieshout
Gradient boosting regression trees (GBRT), and in particular extreme gradient boosting (XGBoost), are widely adopted machine learning tools for classification and regression problems. We consider their utility for improving first- and second-order estimation of spatial point processes. For intensity estimation, classic approaches include kernel estimators in the spatial or covariate domain and log-linear models that parametrically model covariate information. Compared to these, our XGBoost model is flexible, nonparametric, exploits both spatial and covariate information, avoids the adaptive bandwidth selection problem and is still computationally fast. Specifically, we consider customised loss functions related to the Poisson and logistic likelihoods, and, in the spirit of the quasi-likelihood approach, we propose (dynamic) weighted loss functions to incorporate spatial correlations. For second-order estimation, the recent literature has considered pair correlation functions with non-stationarity depending on covariate information. Following that, we propose an XGBoost model to estimate them nonparametrically, based on both covariates and inter-point distances, by designing a composite-likelihood loss function. Several simulation studies were conducted to validate our models.
Gradient boosting regression trees for first- and second-order estimation of spatial point processes
J. Staněk, I. Karafiátová, J. Møller, Z. Pawlas, F. Seitl, V. Beneš
Polycrystalline materials are widely used in various applications and are often modeled using stochastic methods to simulate their 3D microstructure. We propose a methodology that models the joint distribution of crystallographic orientations conditioned on a 3D Laguerre tessellation, accounting for the interaction between the orientations of neighboring grains. Estimation of the model parameters and methods for model comparison are discussed. Finally, our approach is demonstrated using a nickel-titanium shape memory alloy dataset.
Fitting the crystallographic orientation distribution conditioned on a Laguerre tessellation
Cocktail [social area]
9:00-10:30 |
Movement analysis in ecology
Emiko Dupont, Matthew Nunes, Megan Laxton, Janine Illian
Spatial point patterns provide important insights into the behaviour of animals and ecosystems. Many ecological studies emphasise the influence of scale, that is, the spatial resolution of the data. However, in practice, the choice of scales is often ad hoc, more likely determined by practical constraints than by what is of biological interest. Using Fourier and wavelet methods, typically used for identifying frequency information in signals and images, we develop tools for a multiscale approach to the analysis of point pattern data in ecology.
Scale analysis in ecology
Megan Laxton, Finn Lindgren, Janine Illian, Jason Matthiopoulos, Man Ho Suen, Paul Blackwell
Animal movement studies vary widely in spatiotemporal resolutions, model types, and data collection methods, resulting in numerous windows of insight into multiscale habitat preferences and movement processes. For example, telemetry tagging provides insight into local habitat preferences of individuals, whereas survey data can uncover global-scale preferences of the population as a whole. Different data types and modelling methodologies have their own pros and cons, as well as varying sources of bias and perspectives on the underlying processes of interest. Using data integration and joint modelling, we can strengthen estimation of selection parameters, improving overall understanding of animal habitat preferences.
In this talk, I will discuss an ongoing project which aims to bring together two animal movement modelling perspectives (resource selection functions and step selection functions) into a joint model, integrating across multiple data sources and spatial scales. Survey and telemetry data are jointly modelled, combining point process methodology with Langevin diffusion-based movement modelling. Future developments involve implementation of the approach in R-inlabru. Use of the Integrated Nested Laplace Approximation (INLA) methodology and the Stochastic Partial Differential Equation (SPDE) approach allows for the incorporation of a Gaussian Random Field (GRF) in the linear predictors of the joint model, accounting for spatial autocorrelation and reducing the risk of spurious significance in the estimation of selection parameters.
Multiscale Animal Movement Modelling in Inlabru
Paul Blackwell
I will talk about a simple class of piecewise deterministic Markov processes (PDMPs) which have applications in the modelling of animal movement in continuous time, and a fully Bayesian approach to inference for them using reversible jump Markov chain Monte Carlo. I will also look at a more general class of PDMPs that are used as continuous-time MCMC algorithms, the bouncy particle sampler and its generalisations, and explore how those processes can be used to extend the movement modelling above. In particular, they can be used to define movement models that have tractable stationary distributions. Expressing those stationary distributions as a function of spatial covariates then gives a mechanism for incorporating resource selection in a coherent way, extending recent developments in discrete-time and diffusion modelling (e.g. Michelot, Blackwell & Matthiopoulos (2019), Ecology 100(1), e02452).
Inference for Piecewise Deterministic Models of Animal Movement
10:30-11:00 | Coffee break with fruit
11:00-12:00 |
Random sets
Vesna Gotovac Đogaš
We present several depths for possibly non-convex random sets. The depths are applied to the comparison of two samples of non-convex random sets, using a visual method of DD-plots and statistical tests. The advantage of this approach is that it identifies sets within the sample that are responsible for rejecting the null hypothesis of equality of the distributions and provides clues on differences between the distributions. The method is justified by means of a simulation study.
Depth for samples of non-convex sets with applications to testing equality in the distribution of two samples of random sets
Iva Karafiátová
Our research focuses on the analysis of three-dimensional microstructures and crystallographic orientations in polycrystalline materials. Due to the presence of symmetries, the representation of these orientations can be ambiguous, posing a challenge for the analysis. To overcome this challenge, we propose a novel multivariate asymptotic test of independence of orientations, which may also be used to better understand the spatial distribution of the orientations in real materials.
Multivariate asymptotic test of independence of orientations
12:00-13:00 | Lunch [social area]
13:00-15:00 |
Computational methods
Christophe A.N. Biscio, Frédéric Lavancier
The intensity of a spatial point process is one of the first quantities of interest to estimate in the presence of real data. When no covariate is observed, non-parametric kernel estimation is routinely used, but comes with some drawbacks: it adapts poorly to non-convex domains, and the estimation is not consistent in an increasing-domain asymptotic regime. When the intensity depends on observed covariates, most estimation methods are parametric. Non-parametric kernel estimation has been extended to this situation, but it appears to be efficient only for a small number of covariates, and provides no indication of the importance of each covariate, which is crucial for interpretation. In this talk, we show how to adapt random forest regression to circumvent these drawbacks and estimate non-parametrically the intensity of a spatial point process, with or without covariates, while measuring the importance of each variable in the latter case. Our approach can handle non-convex domains together with a large number of covariates. On the theoretical side, we prove that in the case of purely random forests, our method is consistent in both the infill and increasing-domain asymptotic regimes, and may achieve a minimax rate of convergence.
Non-parametric intensity estimation of spatial point processes by random forests
Ottmar Cronie, Mehdi Moradi, Christophe A.N. Biscio
This talk presents a new cross-validation (CV) based statistical theory for point processes, motivated by CV's general ability to reduce overfitting and mean square error. It is based on the combination of two novel concepts for point processes: CV and prediction errors. Our CV approach uses thinning to split a point process/pattern into pairs of training and validation sets, while our prediction errors measure the discrepancy between two point processes. The new approach exploits the prediction errors to measure how well a given model predicts validation sets using the associated training sets. Having discussed its components and properties, we employ it to carry out non-parametric intensity estimation and show, numerically, that it outperforms the state of the art.
Point process learning: A cross-validation-based approach to statistics for point processes
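The thinning-based split at the heart of this approach is easy to state: each point of the pattern is assigned independently to the training set with retention probability p, and to the validation set otherwise. A minimal sketch with our own naming:

```python
import random

def thinning_split(points, p, rng):
    """Independent p-thinning of a point pattern: each point is assigned
    to the training set with probability p, else to the validation set."""
    train, valid = [], []
    for pt in points:
        (train if rng.random() < p else valid).append(pt)
    return train, valid

rng = random.Random(7)
pts = [(rng.random(), rng.random()) for _ in range(1000)]
train, valid = thinning_split(pts, 0.8, rng)
```

For a Poisson process, both halves are again (independent) Poisson processes with intensities scaled by p and 1-p, which is what makes prediction errors between them statistically tractable.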
Claudia Redenbach, Katja Schladitz, Christian Jung, Tin Barisin, Chiara Fend, Niklas Rottmayer
Neural networks are commonly used for image segmentation. Generating training data by manual annotation of images is common practice, but time-consuming and error-prone. We suggest using synthetic image data instead. For their simulation, virtual microstructures are generated from stochastic geometry models. Then, a model of the imaging process is applied to simulate realistic images of the synthetic structures. Binary images of the model realizations yield a ground truth for the segmentation. The resulting pairs are then used to train the neural network. We present two examples of application: segmentation of cracks in µCT images of concrete and of FIB-SEM images of porous structures.
Image segmentation by neural networks trained on synthetic data
Ege Rubak, Suman Rakshit, Gopalan Nair, Adrian Baddeley
We clarify the interpretation of the receiver operating characteristic (ROC) curve for spatial point patterns. Contrary to statements in the literature, the ROC curve does not measure the goodness-of-fit of a spatial model, and its interpretation as a measure of predictive ability is weak. To gain insight, we draw connections between ROC curves and other statistical techniques for spatial data. The area under the ROC curve (AUC) is related to hypothesis tests of the null hypothesis that the explanatory variables have no effect. This suggests several new techniques, which extend the scope of application of ROC curves for spatial data to support variable selection and model selection, analysis of segregation between different types of points, adjustment for a baseline, and analysis of spatial case-control data.
ROC curves for spatial point patterns
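The AUC referred to here has a Mann-Whitney interpretation: it is the probability that the covariate value at a randomly chosen data point exceeds the covariate value at a randomly chosen background location. A minimal sketch (ours; packages such as spatstat provide full implementations):

```python
def spatial_auc(cov_at_points, cov_at_background):
    """AUC of a covariate as a 'classifier' separating point locations
    from background locations: the probability that a random point
    location has a higher covariate value than a random background
    location, counting ties as one half (Mann-Whitney statistic)."""
    n, m = len(cov_at_points), len(cov_at_background)
    wins = 0.0
    for a in cov_at_points:
        for b in cov_at_background:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (n * m)
```

An AUC of 0.5 means the covariate does not separate points from background at all, which is why AUC connects naturally to tests of "no covariate effect" rather than to goodness-of-fit.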
16:30- | Meeting at the bubbles (cable car) to visit La Bastille
19:30- | Conference dinner in the city center of Grenoble |
9:00-10:30 |
Determinantal point processes and Monte Carlo
Ayoub Belhadji
The problem of reconstructing a continuous function from discrete samples has stimulated a considerably rich literature. We propose a universal approach to function reconstruction based on repulsive nodes that comes with strong theoretical guarantees and empirical performance. More precisely, we study reconstructions based on nodes that follow the distributions of determinantal point processes adapted to a given reproducing kernel Hilbert space. We prove fast convergence rates that depend on the eigenvalues of the kernel. This unified analysis provides new insights into approximation problems based on determinantal point processes.
Function reconstruction using determinantal sampling
Diala Hawat, Raphaël Lachièze-Rey, Rémi Bardenet
The Monte Carlo method is a widely used computational technique for estimating the integral of a function by summing its values at points from a Poisson point process. An alternative approach using Determinantal Point Processes (DPP) has recently garnered attention due to its reduced variance compared to the classical Monte Carlo method. However, the high computational cost of sampling from a DPP makes it impractical for many applications.
To overcome this limitation, we introduce a one-step dynamic that moves the points of a homogeneous Poisson point process while maintaining a sampling complexity that is lower than that of a DPP.
We prove that our approach yields a Monte Carlo variant with reduced variance compared to the classical Monte Carlo method.
We conduct several numerical experiments to evaluate the effectiveness of our method and compare our results with other Monte Carlo variants.
Overall, our proposed Monte Carlo method offers promising variance reduction at a tractable computational cost. Monte Carlo method with the Repelled Poisson point process
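As a rough illustration of the ingredients (the exact repulsion dynamic, step size, and variance analysis are in the paper; the Coulomb-type force and the value of `eps` below are assumptions made for the sketch), one can push each Poisson point once along the force exerted by the others and feed the result to the standard estimator $\frac{1}{\lambda}\sum_i f(x_i)$:

```python
import numpy as np

def repel_once(points, eps):
    """One repulsion step: move each point along an (assumed) Coulomb-type
    force F(x_i) = sum_{j != i} (x_i - x_j) / ||x_i - x_j||^d."""
    d = points.shape[1]
    diff = points[:, None, :] - points[None, :, :]   # pairwise x_i - x_j
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)                   # ignore self-pairs
    force = (diff / dist[..., None] ** d).sum(axis=1)
    return (points + eps * force) % 1.0              # wrap to stay in [0,1]^d

def mc_estimate(f, points, intensity):
    """Monte Carlo estimate of the integral of f over [0,1]^d from a point
    configuration with given intensity: (1/lambda) * sum_i f(x_i)."""
    return f(points).sum() / intensity

rng = np.random.default_rng(42)
intensity = 500.0
n = rng.poisson(intensity)                           # homogeneous Poisson count
pts = rng.uniform(size=(n, 2))
f = lambda x: np.sin(np.pi * x[:, 0]) * np.sin(np.pi * x[:, 1])  # integral 4/pi^2
est_poisson = mc_estimate(f, pts, intensity)
est_repelled = mc_estimate(f, repel_once(pts, eps=1e-4), intensity)
```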
Simon Barthelmé, K Usevich, N Tremblay and P-O Amblard
In applications of DPPs to machine learning, it is common to define DPPs from "L-ensembles" built on one of the usual kernels, such as the Gaussian or Matérn kernel. These kernels feature a spatial-scale parameter that has no natural value when defining DPPs. We show that this parameter can be eliminated by taking a limit that makes the kernels "flat". The resulting DPPs are reasonable and useful, and in the one-dimensional case we recover some of the "universal" limits that feature in random matrix theory. In higher dimensions the limits are related to orthogonal polynomials or to eigenfunctions of the Laplace operator. DPPs in the Flat Limit
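A heuristic for why polynomials appear in the one-dimensional flat limit (the precise scalings and statements are in the paper; take this only as a sketch): write the kernel with an explicit scale parameter $\varepsilon$ and expand.

```latex
\[
  k_\varepsilon(x,y) \;=\; g\bigl(\varepsilon(x-y)\bigr)
  \;=\; \sum_{m \ge 0} c_m\, \varepsilon^m (x-y)^m ,
\]
so each term is a polynomial in $x$ and $y$, and to leading order in
$\varepsilon$ the L-ensemble matrix factors through Vandermonde-type
matrices $V_{im} = x_i^{\,m-1}$. One then expects a scaling of the form
\[
  \det L_\varepsilon \;\sim\; c\,\varepsilon^{\,n(n-1)} \det(V)^2 ,
\]
so that, after normalisation, the limiting $n$-point density is
proportional to $\prod_{i<j}(x_i - x_j)^2$: the squared Vandermonde that
characterises orthogonal-polynomial ensembles in random matrix theory.
```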
| |||||
10:30-11:00 | Coffee break with fruits | |||||
11:00-12:00 |
Nonparametric testing
Nonparametric testing Chiara Fend, Claudia Redenbach
In this talk we review both classical and recent approaches for goodness-of-fit tests in the setting of spatial point processes. The latest contributions include functional summary statistics originating from topological data analysis, test procedures such as global envelope tests, and scoring rules such as the continuous ranked probability score.
We discuss novel ways of combining aspects of different test statistics into combined goodness-of-fit tests.
A power study for planar stationary and isotropic point processes compares the robustness of the tests. Goodness-of-Fit Tests for Spatial Point Processes
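For concreteness, a minimal Monte Carlo sketch of a global envelope test of complete spatial randomness, using the nearest-neighbour distance distribution as summary function. The statistic, the distance grid, and the "pointwise extreme" envelope are simplifications of the procedures discussed in the talk, chosen only to make the sketch self-contained:

```python
import numpy as np

rng = np.random.default_rng(1)

def nn_dist_cdf(points, r_grid):
    """Empirical CDF of nearest-neighbour distances (a simple summary
    function G(r)) for a pattern in the unit square."""
    d = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    nn = d.min(axis=1)
    return (nn[:, None] <= r_grid[None, :]).mean(axis=0)

def global_envelope_test(data_pts, n_points, n_sim=199):
    """Pointwise-extreme global envelope test of CSR: simulate binomial
    patterns, build the envelope from pointwise min/max of the simulated
    curves, and reject if the data curve ever leaves it."""
    r_grid = np.linspace(0.0, 0.2, 50)
    sims = np.array([nn_dist_cdf(rng.uniform(size=(n_points, 2)), r_grid)
                     for _ in range(n_sim)])
    lo, hi = sims.min(axis=0), sims.max(axis=0)
    data_curve = nn_dist_cdf(data_pts, r_grid)
    return bool(np.any((data_curve < lo) | (data_curve > hi)))

# A tightly clustered pattern should be rejected; a uniform one usually not.
clustered = 0.05 * rng.standard_normal((100, 2)) + 0.5
reject_clustered = global_envelope_test(clustered, 100)
```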
Tomáš Mrkvička
Parametric methods in spatial statistics are well developed, but they often lead to problems when the model is wrongly specified. The misspecification may concern the autocorrelation structure of random fields, the form of the interaction between points, the model of dependence of one spatial object on other spatial objects, or the dependence of the data on the sampling locations. Such misspecifications may make the tests liberal, with a realised level higher than 0.25 when 0.05 is the target nominal level. In addition, parametric methods are usually based on asymptotic approximations, which can be far from reality for real data due to long-range dependence. Even when all the assumptions of the parametric models are met, these approximations can lead to slight liberality, with levels above 0.08 according to our simulation studies.
Nonparametric methods, on the other hand, are free of these assumptions and are designed to work correctly even in non-asymptotic situations thanks to resampling strategies. We have therefore built the R package NTSS, which collects Nonparametric Tests in Spatial Statistics. The package contains tools for selecting the relevant spatial covariates influencing a point pattern, based on a test of independence between the point pattern and a covariate in the presence of nuisance covariates. It further contains tools for disentangling the dependence between points, marks, and covariates, as well as tests of independence between two point patterns and between two random fields. Advantages of nonparametric testing in spatial statistics
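NTSS itself implements more refined procedures (random shifts and nuisance-covariate adjustment among them); as a toy stand-in, the following sketches the simplest Monte Carlo test of independence between a point pattern and a covariate, without any nuisance adjustment. The test statistic and all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(7)

def covariate_effect_test(points, covariate, n_sim=999):
    """Monte Carlo test of independence between a point pattern and a
    covariate field on the unit square (a simplified stand-in for the
    resampling tests collected in NTSS).

    Statistic: mean covariate value at the points; the null distribution is
    obtained by redrawing the same number of uniform (CSR) locations.
    """
    def stat(pts):
        return covariate(pts).mean()
    t_obs = stat(points)
    t_null = np.array([stat(rng.uniform(size=points.shape))
                       for _ in range(n_sim)])
    # Two-sided Monte Carlo p-value, centred at the null mean.
    p = (1 + np.sum(np.abs(t_null - t_null.mean())
                    >= np.abs(t_obs - t_null.mean()))) / (n_sim + 1)
    return t_obs, p

# Covariate: east-west gradient; pattern biased towards high covariate values.
cov = lambda pts: pts[:, 0]
biased = rng.uniform(size=(300, 2)) ** np.array([0.5, 1.0])  # x pushed right
t, p = covariate_effect_test(biased, cov)
```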
| |||||
12:00-13:30 | End of the workshop and lunch [espace convivialité] |