Transformation function
Abstract
In this chapter, we provide an overview of several data-driven techniques for wireless localization. We initially discuss shallow dimensionality reduction (DR) approaches and investigate a supervised learning method. Subsequently, we transition into deep metric learning and then place particular emphasis on a transformer-based model and self-supervised learning. We highlight a new research direction of employing designed pretext tasks to train AI models, enabling them to learn compressed channel features useful for wireless localization. We use datasets obtained in massive multiple-input multiple-output (MIMO) systems indoors and outdoors to investigate the performance of the discussed approaches.
Keywords
- wireless localization
- dimensionality reduction
- massive MIMO
- metric learning
- transformer
- self-supervised
- deep learning
- CSI
- channel representations
1. Introduction
As we advance toward the sixth-generation (6G), the capabilities of mobile communication networks are anticipated to evolve far beyond their present role of connecting individuals or machines [1, 2]. Among different foreseen application scenarios, wireless localization, sensing, and artificial intelligence (AI) are the most critical ones for advanced future communication systems [2]. AI, in more general terms, incorporates a range of techniques, primarily machine learning (ML) methods, which allow machines to
1.1 Wireless localization with deep learning
Wireless localization methods can be divided into two categories: model-based and data-driven [3]. Model-based techniques require knowledge of the geometric relationship between the estimated parameters of the received signal and the position of the transmitter [4]. Therefore, the performance of the existing model-based methods is heavily degraded when such a relationship is not available, for example, in non-line-of-sight (NLOS) or dense multi-path propagation conditions [4, 5, 6, 7]. Consequently, data-driven approaches, specifically ML, have emerged over the years [8]. Among the various ML approaches, deep neural network (DNN)-based models have demonstrated exceptional localization accuracy [9, 10, 11, 12, 13, 14, 15].
A common approach in cellular-based localization systems is to use derived features of the estimated channel at the base station (BS), such as signal time of arrival (ToA), angle of arrival (AoA), received signal strength (RSS), or a combination of them [6, 16, 17]. On the other hand, DNN-based methods prefer utilizing the channel state information (CSI) in its acquired form [9, 10, 13, 18, 19]. The acquisition of CSI is particularly important for advanced communication systems that use massive multiple-input multiple-output (MIMO) [20, 21]. Hence, CSI is, in general, readily available for localization. Massive MIMO, characterized by a large antenna array at the BS, is widely regarded as the key technology for the fifth-generation (5G) [22, 23] and forthcoming communication systems [24]. Employing a considerable number of antennas enhances the angular resolution of the received multipath signal, thereby benefiting localization methods [25, 26, 27].
ML-based methods can generally be supervised or unsupervised, depending on whether labeled training data are necessary. The vast majority of DNN localization techniques are supervised and constrained to task-specific feature learning, which raises concerns about their ability to work across diverse scenarios and to adapt in data-scarce environments. Developing DNN methods that sustain good accuracy and transferability is therefore an essential research topic for future wireless localization systems.
In this chapter, we cover three approaches for wireless localization. We investigate conventional dimensionality reduction (DR) techniques, such as PCA and basic manifold approaches. Afterward, we discuss a supervised learning method that uses multi-layer perceptrons (MLP). Furthermore, we extend the basic MLP into a deep metric learning method. Finally, we transition into advanced DNN architectures such as transformers, combined with sophisticated learning frameworks like self-supervised learning, emphasizing their growing research importance in wireless localization and beyond.
Specifically, we structure the rest of this chapter as follows. First, in Section 2, we outline the system model to generate synthetic datasets for validating DNNs. In Section 3, we investigate classical DR approaches. Then, in Section 4, we discuss supervised DNN methods and deep metric learning. In Section 5, we elaborate on more advanced neural network architectures and learning frameworks. Finally, we draw our conclusions in Section 6.
2. System model
We consider a massive MIMO uplink setup as illustrated in Figure 1. The base station has
2.1 Signal model
The uplink input-output relationship of a single-user transmission from
Here,
2.2 Channel model
The DNN techniques detailed in this chapter are data-driven and therefore are not constrained to a particular channel model. Nevertheless, it is beneficial to clarify the channel using location-related parameters like distances and angles for investigation. Consequently, we utilize a commonly adopted geometric channel model to characterize the estimated channel of the received signal for such scenarios as that illustrated in Figure 1 [28],
where
Here,
2.3 Dynamic scenario
We consider that the environment may change over time. When using synthetic data to evaluate the proposed methods in this chapter, we account for situations where some scattering objects change their positions over the time interval
Next, we stack
where
where
3. Shallow DR techniques
In this section, we discuss dimensionality reduction approaches to obtain a channel representation useful for positioning the UE. Specifically, we first investigate shallow approaches based on principal component analysis and iterative scaling.
3.1 Principal component analysis and iterative scaling
Since principal component analysis (PCA) is the most widely used technique for lossy data compression or feature extraction, we first illustrate the efficacy of the PCA-derived channel subspace when used to determine the UE location. To do so, let us consider a single-path line-of-sight (LOS) channel, that is,
and center it, that is,
Our goal is to construct a representation map of (pseudo-)locations,
matching the first
As an optimization problem, PCA is closely related to the so-called multidimensional scaling (MDS) [30], or metric MDS. MDS is another reconstruction method to obtain a representation
In addition to using PCA to construct a low-dimensional map representation by optimizing a convex objective using eigendecomposition, we next look into an alternative to classical scaling, which is known as MDS with Sammon mapping [33]. Sammon mapping is an iterative, gradient-based approach to minimize a non-convex objective function and is considered a generalization of metric MDS. In this case, we seek to find an optimal representation by normalizing the squared errors using the pairwise distance in the original feature space. To obtain the channel features while reducing the distance between low- and high-dimensional representations, we minimize the cost function.
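As a minimal sketch of the two shallow DR steps described above, the following NumPy code projects features onto their leading principal axes and evaluates the standard Sammon stress [33] between the original and the low-dimensional pairwise distances. The function names, the assumption of real-valued input features (for example, stacked real and imaginary parts of the CSI), and the small constant `eps` are illustrative choices, not taken from the chapter. The iterative gradient-based minimization of the stress is not shown.

```python
import numpy as np

def pca_embed(X, d):
    """Project the rows of X (N samples x M real features) onto the top-d principal axes."""
    Xc = X - X.mean(axis=0, keepdims=True)          # center the features
    # Right singular vectors of the centered data are the principal axes.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:d].T

def pairwise_dist(X):
    """Euclidean distance between every pair of rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def sammon_stress(X_high, Z_low, eps=1e-12):
    """Sammon stress: squared distance errors normalized by the original distances."""
    iu = np.triu_indices(X_high.shape[0], k=1)      # each pair counted once
    d_high = pairwise_dist(X_high)[iu]
    d_low = pairwise_dist(Z_low)[iu]
    return ((d_high - d_low) ** 2 / (d_high + eps)).sum() / (d_high.sum() + eps)

# Hypothetical usage with random stand-in features.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 32))
Z = pca_embed(X, 2)
stress = sammon_stress(X, Z)
```

A PCA embedding is a common starting point for the iterative Sammon optimization, since it already preserves the largest-variance directions.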
Assuming a subset of CSI has labels, its low-dimensional features can serve as either a reference map or an input to a task-specific model to derive the final location of the UEs. Below, we aim to capture the CSI manifold and investigate the localization performance when CSI is projected onto, for instance, a
and choose
Acquiring the reference map poses a challenge in determining the optimal number of low-dimensional features,
3.2 Location estimation using D-dimensional features
We set up a simple two-dimensional ROI with a layout of
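A fingerprinting-style lookup on a labeled reference map can be sketched as follows. The sketch assumes a k-nearest-neighbors estimator that averages the known positions of the closest reference features; the function name `knn_locate`, the value of `k`, and the toy data are hypothetical and only illustrate the idea.

```python
import numpy as np

def knn_locate(z_query, z_ref, pos_ref, k=3):
    """Estimate a position as the mean position of the k nearest reference features.

    z_query: (D,) low-dimensional feature of the unknown UE
    z_ref:   (N, D) low-dimensional features of labeled reference samples
    pos_ref: (N, 2) known positions of the reference samples
    """
    dist = np.linalg.norm(z_ref - z_query, axis=1)   # distance in feature space
    nearest = np.argsort(dist)[:k]                   # indices of the k closest references
    return pos_ref[nearest].mean(axis=0)

# Toy reference map: four labeled points in a 2-D feature space.
z_ref = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
pos_ref = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [10.0, 10.0]])
estimate = knn_locate(np.array([0.1, 0.1]), z_ref, pos_ref, k=3)
```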
In the next section, we discuss wireless localization with neural networks in general and deep metric learning in particular.
4. Basic DNN methods and deep metric learning
Numerous works propose supervised training methods for CSI-based localization, where the utilized channel features are labeled with corresponding target location information (e.g., position coordinates of the transmitter) [10, 13, 18, 19, 35, 36, 37]. These techniques establish a mapping function between the obtained CSI and the corresponding position coordinates, intending to achieve accurate localization performance on new, unseen data.
In Figure 3, we show a straightforward and relatively low-complexity localization method based on a feedforward neural network, that is, an MLP. Throughout this section, we refer to it as the base DNN, since we use it both as a baseline for the metric-learning approach and as the basis for the proposed model. In the following, we set the number of hidden layers to
The wireless localization problem for supervised models is often formulated as a regression task. In such a case, we consider the base DNN as a function
estimated by averaging over the batch of training samples.
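The regression formulation can be illustrated with a small NumPy sketch of an MLP forward pass and a mean squared error averaged over a batch. The layer sizes, the ReLU activations, and the initialization are assumptions made for illustration; the chapter's base DNN may use different choices, and the training loop is omitted.

```python
import numpy as np

def init_mlp(sizes, rng):
    """Create (weight, bias) pairs for consecutive layer sizes, e.g. [64, 128, 128, 2]."""
    return [(rng.normal(0.0, np.sqrt(2.0 / m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp_forward(params, x):
    """Map a batch of input features x (B x M) to position estimates (B x 2)."""
    for W, b in params[:-1]:
        x = np.maximum(x @ W + b, 0.0)   # hidden layers with ReLU (assumed)
    W, b = params[-1]
    return x @ W + b                     # linear output layer for regression

def mse_loss(params, x, p):
    """Squared position error, averaged over the batch of training samples."""
    err = mlp_forward(params, x) - p
    return np.mean(np.sum(err ** 2, axis=1))

# Hypothetical usage: 64 real-valued CSI features per sample, 2-D positions.
rng = np.random.default_rng(0)
params = init_mlp([64, 128, 128, 2], rng)
x = rng.standard_normal((32, 64))
p = rng.uniform(0.0, 10.0, size=(32, 2))
loss = mse_loss(params, x, p)
```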
Alternatively, we consider a classifier that can learn to separate
Here,
In general, supervised DNN-based methods excel in any task where extensive labeled datasets are readily available for training. However, collecting large-scale geo-tagged channel estimates for different mobile network tasks can be time-consuming, error-prone, and, in many cases, impractical. Thus, in the next section, we discuss more advanced methods that can learn channel features from unlabeled data.
4.1 Metric learning DNN and contrastive task
In contrast to the DR approaches we discussed in Section 3, in deep metric learning, we obtain an embedding space using neural networks. Consequently, the objective of the DNN becomes to learn a
A Siamese-based method can use any number of networks, although two or three is typical. A Siamese architecture with three identical networks is called a triplet network, which is the model we use here. Specifically, the network is composed of three branches. Each branch employs the same hyperparameters as the base DNN for the intermediate computation layers. For the final layer (i.e., the output), the identity function is utilized,
4.1.1 Contrastive task
In order to be able to design the contrastive task and therefore sample channels from different regions, we partition the ROI into RPs’s sub-regions. We sample similar and non-similar CSI; that is, if two CSI vectors correspond to the same sub-region, they are considered similar; otherwise, they are not similar. For obtaining the triplets, we consider every
Since all three networks illustrated in Figure 4 share the same weights, we implement the three branches using a single network, the base DNN. During the training phase, we successively feed the channel realizations within each triplet. Consequently, we obtain their respective embeddings
where
To evaluate the performance gains of the triplet network, we compare it to the supervised classifier discussed in Section 4.
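The contrastive task of Section 4.1.1 can be sketched with a margin-based triplet loss in the style of [39], together with a sampler that draws the positive from the same sub-region as the anchor and the negative from a different one. The squared Euclidean distance, the margin value, and the helper names are assumptions for illustration and may differ from the loss used in the chapter. The sampler assumes at least two sub-regions and at least two samples in the anchor's sub-region.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss that pulls anchor-positive embeddings together and pushes negatives away."""
    d_pos = np.sum((anchor - positive) ** 2, axis=1)   # squared distance to similar sample
    d_neg = np.sum((anchor - negative) ** 2, axis=1)   # squared distance to dissimilar sample
    return np.mean(np.maximum(d_pos - d_neg + margin, 0.0))

def sample_triplet(region_ids, rng):
    """Return indices (anchor, positive, negative) based on sub-region membership."""
    a = rng.integers(len(region_ids))
    same = np.flatnonzero(region_ids == region_ids[a])
    same = same[same != a]                              # exclude the anchor itself
    other = np.flatnonzero(region_ids != region_ids[a])
    return a, rng.choice(same), rng.choice(other)

# Hypothetical usage: six samples spread over three sub-regions.
rng = np.random.default_rng(0)
region_ids = np.array([0, 0, 1, 1, 2, 2])
a, p, n = sample_triplet(region_ids, rng)
```

Because the three branches share weights, the three embeddings passed to `triplet_loss` would come from the same network evaluated on the anchor, positive, and negative channels.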
4.1.2 Impact of LOS and density of reference locations
In the simulations conducted, we choose the model that demonstrates the lowest validation loss over
The achievable positioning accuracy is primarily determined by the density of RPs: as the number of RPs increases, so does the achievable accuracy. However, the presence or absence of a LOS path in a multipath channel also influences the accuracy of the prediction and can limit it even when a high density of RPs is used.
In Figure 5, we show the impact of LOS with the increased density of RPs for both DNNs, the classifier and the triplet network. In the presence of a LOS path, a higher density of RPs enhances the overall location estimation accuracy. The triplet network surpasses the base DNN classifier when
5. Advanced DNN localization methods
Previous works predominantly feed raw CSI into their proposed DNN architectures. Convolutional neural networks (CNNs) and MLPs are the main components of such neural network architectures. As the system bandwidth and the number of antenna elements at the BS become larger [41], the dimensionality of CSI also increases. This can pose a challenge for methods that rely solely on MLPs, for example, when the input layer has too few units. Adding units increases the number of parameters. As a consequence, the capacity of the model becomes higher, and a more extensive training set is needed. Furthermore, MLPs cannot capture local correlations [42], for example, between the channels at neighboring antennas or subcarriers, information that even hand-crafted feature extractors commonly exploit. Finally, due to their fully connected nature, they have no mechanism to ensure invariance with respect to small-scale variations in the input channel. Hence, they might not be the optimal choice for channel-feature learning or channel-to-location mapping.
To cope with the issues of imperfect channel estimates and other system impairments, various works suggest the conventional option of hand-designing feature extractors for more robust models. A common idea in the literature for hand-engineered features is to leverage the channel transform domains, such as angle, delay, or Doppler [35, 43]. However, hand-crafting input features limits the expressive capacity of DNN models, hence constraining the generalization of learned representations or the trained model.
Several studies have recommended the use of convolutional neural networks (CNNs) to better learn channel attributes essential for mapping the channel-to-location [10, 14, 44]. Nevertheless, CNNs introduce a significant inductive bias by employing filters to
In contrast to long-standing approaches, transformer-based architectures proposed more recently in natural language processing [45], computer vision [46], or wireless communications [29] adopt the
5.1 Wireless transformer
As presented in [29], and depicted in Figure 6, WiT is a fully supervised technique and is trained to minimize
In contrast to prior DNN methods, we view the input channel
Transformer-based models, due to their lack of recurrence or standard convolutional operations, treat all subcarrier representations as permutation invariant, ignoring the sequence order of frequency-dependent subcarriers. To address this limitation, we incorporate positional encodings to represent the sequence position of each subcarrier. More specifically, we add a random and learnable real-valued vector embedding,
As for the input to the MLP head, which follows the transformer block, we either average the derived features or utilize an extra symbol,
5.1.1 Attention
Central to
where
Moreover, the embedding
WiT leverages the per-subcarrier channel structure and relies on learning the large-scale channel features either by averaging out the representations learned from subcarriers or by obtaining a unique representation across the entire channel. However, the wireless channel is characterized by both macroscopic and microscopic fading. Consequently, we extend WiT to a self-supervised learning framework, that is, self-supervised wireless transformer (SWiT) [51], an approach that utilizes both microscopic as well as macroscopic fading characteristics of the channel. Building upon the advantages of self-supervised training, the method in [51] may enable and facilitate several potential applications in wireless communications, extending beyond the localization task. Learned channel representations, that is, embeddings, can serve as pseudo-locations and facilitate different tasks, ranging from beamforming to localization for the purpose of different location-based services (LBS). For instance, it could be used to determine if two transmitters are close to a reference location, or a
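For the attention mechanism of Section 5.1.1, a single-head scaled dot-product self-attention [45] over subcarrier tokens can be sketched as follows. The token dimension, the projection matrices, and the way the positional embedding is added are illustrative assumptions; in a trained model these quantities are learned, and the actual WiT block additionally contains components such as layer normalization and an MLP that are not shown here.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence of tokens X (N x D)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # similarity between every pair of tokens
    A = softmax(scores, axis=-1)              # each row is a distribution over tokens
    return A @ V, A

# Hypothetical usage: 16 subcarrier tokens of dimension 32, head dimension 8.
rng = np.random.default_rng(0)
tokens = rng.standard_normal((16, 32))
pos_emb = 0.02 * rng.standard_normal((16, 32))   # stands in for a learnable embedding
Wq, Wk, Wv = (rng.standard_normal((32, 8)) / np.sqrt(32) for _ in range(3))
out, attn = self_attention(tokens + pos_emb, Wq, Wk, Wv)
```

Without the added positional embedding, permuting the rows of `tokens` would simply permute the rows of `out`, which is why the subcarrier order has to be encoded explicitly.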
5.2 Self-supervised wireless transformer
In contrast to other studies in wireless localization, in this part of the chapter, we discuss our approach to exploiting redundant and complementary information across the subcarriers. Doing so enables us to predict the channel from a single realization. While it might seem counterintuitive to learn to predict the information we possess, we later demonstrate that such an approach can be suitable for deriving meaningful representations for estimating different wireless communication tasks. In contrast to the triplet network detailed in Section 4.1, where we aim to distinguish channels from different sub-regions, here we avoid the necessity of sampling negative pairs and employing a contrastive loss. Furthermore, we show how to leverage the microscopic fading characteristics by designing subcarrier-level
In the following, we use SWiT to derive a channel representation, denoted as
Next, all channel views are processed sequentially via the encoders
Given that the wireless channel exhibits both macroscopic and microscopic fading, our method is tailored to address this behavior. Therefore, the learning is split into two separate projectors, namely, micro-fading and macro-fading channel representation learning modules.
5.2.1 Micro-fading level representations
The micro-fading representation module uses a pretext task to obtain representations at the subcarrier level. In the case of SWiT, for each representation
5.2.2 Macro-fading level representations
In contrast to the micro-fading module, the macro-fading module of SWiT processes all channel views to output the respective embeddings
The overall loss function is given by
5.3 Datasets and self-supervised training
We use real-world channel measurements and synthetic data to further benchmark the performance of the SWiT.
For actual measurements, we selected the
We also choose a synthetic dataset generated based on the discussion in Section 2.2. The datasets are referred to as S-200 and HB-200 [29], representing two dynamic railway scenarios, each with
5.3.1 Self-supervised training
For both the online and target models, specifically
When training SWiT with a single GPU, it may require fine-tuning the weight decay and learning rate depending on the dataset size and the number of iterations (i.e., the batch size). For most of the experiments in this chapter, we trained SWiT without labels for varying epoch lengths, using a batch size of
Subcarrier selection (RSS) | |||
Subcarrier flipping (RSF) | |||
Gain offset (RGO) | |||
Fading component (RFC) | |||
Sign change (RSC) | |||
Normalization | |||
Gaussian noise |
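
To make the listed transformations more concrete, the sketch below gives one plausible NumPy implementation for several of them, applied to a complex channel matrix with one row per subcarrier and one column per antenna. All parameter values (number of kept subcarriers, probabilities, gain range, SNR) and the exact definitions are assumptions, since the table values are not available here; the fading component (RFC) transformation is omitted because its definition cannot be inferred from its name alone.

```python
import numpy as np

def random_subcarrier_selection(H, keep, rng):
    """Keep a random subset of subcarriers (rows), preserving their order."""
    idx = np.sort(rng.choice(H.shape[0], size=keep, replace=False))
    return H[idx]

def random_subcarrier_flipping(H, rng, p=0.5):
    """Reverse the subcarrier order with probability p."""
    return H[::-1] if rng.random() < p else H

def random_gain_offset(H, rng, max_db=3.0):
    """Scale the whole channel by a random gain drawn uniformly in [-max_db, max_db] dB."""
    gain_db = rng.uniform(-max_db, max_db)
    return H * 10.0 ** (gain_db / 20.0)

def random_sign_change(H, rng, p=0.5):
    """Flip the sign of the channel with probability p."""
    return -H if rng.random() < p else H

def normalize(H):
    """Scale the channel to unit Frobenius norm."""
    return H / np.linalg.norm(H)

def add_gaussian_noise(H, rng, snr_db=20.0):
    """Add circularly symmetric complex Gaussian noise at the given SNR."""
    noise_var = np.mean(np.abs(H) ** 2) / 10.0 ** (snr_db / 10.0)
    noise = rng.standard_normal(H.shape) + 1j * rng.standard_normal(H.shape)
    return H + np.sqrt(noise_var / 2.0) * noise

# Hypothetical usage: build one augmented view of a 64-subcarrier, 32-antenna channel.
rng = np.random.default_rng(0)
H = rng.standard_normal((64, 32)) + 1j * rng.standard_normal((64, 32))
view = add_gaussian_noise(normalize(random_subcarrier_selection(H, 48, rng)), rng)
```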
5.4 Localization performance and transferability
To assess the performance of learned representations, we perform linear and fine-tuning evaluation for the trained models on several localization tasks. We also investigate the transferability of the models to other tasks and datasets.
5.4.1 Localization accuracy with limited data
For linear analysis, we train a regressor
To assess the performance of the embeddings, we also perform fine-tuning using labeled data. Specifically, the backbone (i.e., target encoder) is initialized with the pre-trained weights, forming the new network
In Figure 9, we highlight the improvements of SWiT in a small data regime for the KUL-NLOS dataset. In contrast to the analysis in [51], we offer additional insights here, highlighting when a simple linear approach is sufficient and when the computationally demanding WiT and SWiT are justified. For the data regimes in the figure, we investigate the RMSE separately for six different randomly sampled datasets. Datasets for evaluation are sampled based on the Poisson point process (PPP) with varying density parameter values
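The linear evaluation protocol can be illustrated with a short NumPy sketch that fits a linear regressor on frozen embeddings and reports the RMSE of the position estimates. The closed-form ridge solution, the regularization value, and the synthetic stand-in data are assumptions for illustration; the chapter's regressor may instead be trained with gradient descent.

```python
import numpy as np

def fit_linear_regressor(Z, P, ridge=1e-6):
    """Least-squares map from frozen embeddings Z (N x D) to positions P (N x 2)."""
    Za = np.hstack([Z, np.ones((Z.shape[0], 1))])        # append a bias column
    A = Za.T @ Za + ridge * np.eye(Za.shape[1])          # small ridge term for stability
    return np.linalg.solve(A, Za.T @ P)

def predict(W, Z):
    """Apply the fitted linear map to new embeddings."""
    return np.hstack([Z, np.ones((Z.shape[0], 1))]) @ W

def rmse(P_hat, P):
    """Root mean squared Euclidean position error."""
    return np.sqrt(np.mean(np.sum((P_hat - P) ** 2, axis=1)))

# Hypothetical usage with random stand-in embeddings and positions.
rng = np.random.default_rng(0)
Z_train, Z_test = rng.standard_normal((200, 16)), rng.standard_normal((50, 16))
B = rng.standard_normal((16, 2))
P_train, P_test = Z_train @ B, Z_test @ B
W = fit_linear_regressor(Z_train, P_train)
error = rmse(predict(W, Z_test), P_test)
```

Keeping the encoder frozen in this way isolates the quality of the learned embeddings from the capacity of the downstream model, whereas fine-tuning also updates the encoder weights.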
5.4.2 Spot-localization and transfer learning
Table 2 reports the model’s ability to
KUL-NLOS | KUL-LOS | KUL-LOS-DIS | S-200 | HB-200 | |||||
---|---|---|---|---|---|---|---|---|---|
Method | |||||||||
Random | |||||||||
SWiT | |||||||||
SWiT + TF |
In Figure 10, we show the two-dimensional t-SNE [53] embeddings of a merged KUL dataset, encompassing datasets in LOS and NLOS conditions, resulting in a total of
6. Conclusion
In this chapter, we presented three different learning approaches and introduced multiple DNN-based models. First, we used classical PCA and metric scaling subspace methods to obtain useful channel features to determine the UE location. Then, we discussed neural networks as feature extractors and demonstrated the performance gains of the proposed triplet network when compared to an MLP classifier. Finally, we introduced self-supervised learning for wireless channel representation learning. We showed that we can design
Acknowledgments
This work has been funded by the Christian Doppler Laboratory for Digital Twin assisted AI for sustainable Radio Access Networks, Institute of Telecommunications, TU Wien. The financial support by the Austrian Federal Ministry for Labour and Economy and the National Foundation for Research, Technology and Development and the Christian Doppler Research Association is gratefully acknowledged.
Abbreviations
AI | artificial intelligence
6G | sixth-generation
ML | machine learning
DNN | deep neural network
CSI | channel state information
UE | user equipment
MIMO | multiple-input multiple-output
BS | base station
LOS | line-of-sight
NLOS | non-line-of-sight
DAS | distributed antenna system
RRH | remote radio head
PCA | principal component analysis
MDS | multidimensional scaling
kNN | k-nearest neighbors
MLP | multi-layer perceptron
RP | reference point
SSL | self-supervised learning
WiT | wireless transformer
SWiT | self-supervised wireless transformer
References
1. Rong B. 6G: The next horizon: From connected people and things to connected intelligence. IEEE Wireless Communications. 2021; 28 (5):8-8
2. You X, Wang C-X, Huang J, Gao X, Zhang Z, Wang M, et al. Towards 6G wireless communication networks: Vision, enabling technologies, and new paradigm shifts. Science China Information Sciences. 2021; 64 :1-74
3. Wen F, Wymeersch H, Peng B, Tay WP, So HC, Yang D. A survey on 5G massive MIMO localization. Digital Signal Processing. 2019; 94 :21-28
4. Wymeersch H, Seco-Granados G. Radio localization and sensing - Part II: State-of-the-art and challenges. IEEE Communications Letters. 2022; 26 (12):2821-2825
5. Wylie MP, Holtzman J. The non-line of sight problem in mobile location estimation. In: Proceedings of ICUPC-5th International Conference on Universal Personal Communications. Vol. 2. Cambridge, MA, USA: IEEE; 1996. pp. 827-831
6. Sayed AH, Tarighat A, Khajehnouri N. Network-based wireless location: Challenges faced in developing techniques for accurate wireless location information. IEEE Signal Processing Magazine. 2005; 22 (4):24-40
7. Witrisal K, Meissner P, Leitinger E, Shen Y, Gustafson C, Tufvesson F, et al. High-accuracy localization for assisted living: 5G systems will turn multipath channels from foe to friend. IEEE Signal Processing Magazine. 2016; 33 (2):59-70
8. Burghal D, Ravi AT, Rao V, Alghafis AA, Molisch AF. A comprehensive survey of machine learning based localization with wireless signals. arXiv. 2020
9. Wang X, Gao L, Mao S, Pandey S. DeepFi: Deep learning for indoor fingerprinting using channel state information. In: 2015 IEEE Wireless Communications and Networking Conference (WCNC). New Orleans, LA, USA: IEEE; 2015. pp. 1666-1671
10. Vieira J, Leitinger E, Sarajlic M, Li X, Tufvesson F. Deep convolutional neural networks for massive MIMO fingerprint-based positioning. In: 2017 IEEE 28th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC). IEEE; 2017. pp. 1-6
11. Niitsoo A, Edelhäußer T, Mutschler C. Convolutional neural networks for position estimation in TDOA-based locating systems. In: 2018 International Conference on Indoor Positioning and Indoor Navigation (IPIN). Nantes, France: IEEE; 2018. pp. 1-8
12. Gante J, Falcao G, Sousa L. Deep learning architectures for accurate millimeter wave positioning in 5G. Neural Processing Letters. 2020; 51 (1):487-514
13. De Bast S, Guevara AP, Pollin S. CSI-based positioning in massive MIMO systems using convolutional neural networks. In: 2020 IEEE 91st Vehicular Technology Conference (VTC2020-Spring). Antwerp, Belgium: IEEE; 2020. pp. 1-5
14. Ayyalasomayajula R, Arun A, Wu C, Sharma S, Sethi AR, Vasisht D, et al. Deep learning based wireless localization for indoor navigation. In: Proceedings of the 26th Annual International Conference on Mobile Computing and Networking. New York, NY, USA: Association for Computing Machinery (ACM); 2020. pp. 1-14
15. Salihu A, Schwarz S, Rupp M. Towards scalable uncertainty aware DNN-based wireless localisation. In: 2021 29th European Signal Processing Conference (EUSIPCO). Dublin, Ireland. 2021. pp. 1706-1710
16. Zekavat R, Buehrer RM. Handbook of Position Location: Theory, Practice and Advances. Vol. 27. New Jersey, USA: John Wiley & Sons; 2011
17. Rupp M, Schwarz S. An LS localisation method for massive MIMO transmission systems. In: ICASSP - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Brighton, UK: IEEE; 2019. pp. 4375-4379
18. Arnold M, Hoydis J, ten Brink S. Novel massive MIMO channel sounding data applied to deep learning-based indoor positioning. In: SCC 2019; 12th International ITG Conference on Systems, Communications and Coding. Rostock, Germany: VDE; 2019
19. Hoang MT, Yuen B, Ren K, Dong X, Lu T, Westendorp R, et al. A CNN-LSTM quantifier for single access point CSI indoor localization. arXiv. 2020
20. Adhikary A, Nam J, Ahn J-Y, Caire G. Joint spatial division and multiplexing - The large-scale array regime. IEEE Transactions on Information Theory. 2013; 59 (10):6441-6463
21. Ngo HQ, Larsson EG, Marzetta TL. Energy and spectral efficiency of very large multiuser MIMO systems. IEEE Transactions on Communications. 2013; 61 (4):1436-1449
22. Lu L, Li GY, Lee Swindlehurst A, Ashikhmin A, Zhang R. An overview of massive MIMO: Benefits and challenges. IEEE Journal of Selected Topics in Signal Processing. 2014; 8 (5):742-758
23. Marzetta TL. Noncooperative cellular wireless with unlimited numbers of base station antennas. IEEE Transactions on Wireless Communications. 2010; 9 (11):3590-3600
24. Schwarz S, Pratschner S. Multiple antenna systems in mobile 6G: Directional channels and robust signal processing. IEEE Communications Magazine. 2023; 61 (4):64-70
25. Schmidt R. Multiple emitter location and signal parameter estimation. IEEE Transactions on Antennas and Propagation. 1986; 34 (3):276-280
26. Krim H, Viberg M. Two decades of array signal processing research: The parametric approach. IEEE Signal Processing Magazine. 1996; 13 (4):67-94
27. Liu W, Haardt M, Greco MS, Mecklenbräuker CF, Willett P. Twenty-five years of sensor array and multichannel signal processing: A review of progress to date and potential research directions. IEEE Signal Processing Magazine. 2023; 40 (4):80-91
28. Heath RW, Gonzalez-Prelcic N, Rangan S, Roh W, Sayeed AM. An overview of signal processing techniques for millimeter wave MIMO systems. IEEE Journal of Selected Topics in Signal Processing. 2016; 10 (3):436-453
29. Salihu A, Schwarz S, Rupp M. Attention aided CSI wireless localization. In: 2022 IEEE 23rd International Workshop on Signal Processing Advances in Wireless Communication (SPAWC). Oulu, Finland: IEEE; 2022. pp. 1-5
30. Torgerson WS. Multidimensional scaling: I. Theory and method. Psychometrika. 1952; 17 (4):401-419
31. Van Der Maaten L, Postma E, Van den Herik J. Dimensionality reduction: A comparative review. Journal of Machine Learning Research. 2009; 10 (66–71):13
32. Williams CKI. On a connection between kernel PCA and metric multidimensional scaling. Machine Learning. 2002; 46 (1–3):11-19
33. Sammon JW. A nonlinear mapping for data structure analysis. IEEE Transactions on Computers. 1969; 18 (5):401-409
34. Moltchanov D. Distance distributions in random networks. Ad Hoc Networks. 2012; 10 (6):1146-1166
35. Sun X, Chi W, Gao X, Li GY. Fingerprint-based localization for massive MIMO-OFDM system with deep convolutional neural networks. IEEE Transactions on Vehicular Technology. 2019; 68 (11):10846-10857
36. Wang X, Wang X, Mao S. Deep convolutional neural networks for indoor localization with CSI images. IEEE Transactions on Network Science and Engineering. 2020; 7 (1):316-327
37. Salihu A, Schwarz S, Rupp M. Learning-based remote radio head selection and localization in distributed antenna system. In: 2022 Joint European Conference on Networks and Communications & 6G Summit (EuCNC/6G Summit). Grenoble, France: IEEE; 2022. pp. 65-70
38. Bromley J, Guyon I, LeCun Y, Säckinger E, Shah R. Signature verification using a “Siamese” time delay neural network. Advances in Neural Information Processing Systems. 1994:737-744
39. Schroff F, Kalenichenko D, Philbin J. FaceNet: A unified embedding for face recognition and clustering. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Boston, MA, USA: IEEE; 2015. pp. 815-823
40. Salihu A, Schwarz S, Pikrakis A, Rupp M. Low-dimensional representation learning for wireless CSI-based localisation. In: 16th International Conference on Wireless and Mobile Computing, Networking and Communications (WiMob) (50308). Thessaloniki, Greece: IEEE; 2020. pp. 1-6
41. ETSI. Study on New Radio (NR) Access Technology. Technical Specification (TS) 38.912. Valbonne, France: European Telecommunications Standards Institute (ETSI); 2021. Version 15.0.0
42. LeCun Y, Bottou L, Bengio Y, Haffner P. Gradient-based learning applied to document recognition. Proceedings of the IEEE. 1998; 86 (11):2278-2324
43. Ferrand P, Decurninge A, Guillaud M. DNN-based localization from channel estimates: Feature design and experimental results. In: GLOBECOM 2020–2020 IEEE Global Communications Conference. Taipei, Taiwan: IEEE; 2020. pp. 1-6
44. De Bast S, Pollin S. MaMIMO CSI-based positioning using CNNs: Peeking inside the black box. In: 2020 IEEE International Conference on Communications Workshops (ICC Workshops). Dublin, Ireland: IEEE; 2020. pp. 1-6
45. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, et al. Attention is all you need. Advances in Neural Information Processing Systems. 2017; 30
46. Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv. 2020
47. Bahdanau D, Cho K, Bengio Y. Neural machine translation by jointly learning to align and translate. arXiv. 2014
48. Zhao H, Jia J, Koltun V. Exploring self-attention for image recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, WA, USA: IEEE; 2020. pp. 10076-10085
49. Devlin J, Chang M-W, Lee K, Toutanova K. BERT: Pre-training of deep bidirectional transformers for language understanding. San Diego, United States: ICLR 2015; 2018
50. Ba JL, Kiros JR, Hinton GE. Layer normalization. arXiv. 2016
51. Salihu A, Schwarz S, Rupp M. Self-supervised and invariant representations for wireless localization. arXiv. 2023
52. Loshchilov I, Hutter F. Decoupled weight decay regularization. arXiv. 2017
53. Van der Maaten L, Hinton G. Visualizing data using t-SNE. Journal of Machine Learning Research. 2008; 9
Notes
- https://github.com/ars205/ssl_wireless