ICA | Exploring the Spatio-temporal Dynamics of Intrinsic Network Models in fMRI with RNN-ICA

机器学习炼丹术

Published 2023-03-16 21:24:01

This article is part of the column: 机器学习炼丹术

Paper title: Spatio-temporal Dynamics of Intrinsic Networks in Functional Magnetic Imaging Data Using Recurrent Neural Networks


Introduction

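There are many ways to analyze MRI data. One of them is independent component analysis (ICA, 1995), which assumes that the data is a mixture of maximally independent sources. ICA is trainable through one of many relatively simple optimization routines that maximize non-Gaussianity or minimize mutual information. However, like other such methods, ICA is agnostic to temporal order: the multivariate signal at each time step is treated as independent and identically distributed (i.i.d.).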
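To make the "agnostic to temporal order" point concrete, here is a minimal sketch. It does not run ICA itself; it only checks that the kind of statistics ICA objectives are built from (covariance, and kurtosis as a simple non-Gaussianity measure) are unchanged when the time steps are shuffled, while a temporal statistic such as lag-1 autocorrelation is destroyed. The three toy signals and the helper names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 500
t = np.arange(T)

# Three illustrative signals with clear temporal structure (columns of X).
X = np.stack(
    [np.sin(0.05 * t), np.sign(np.sin(0.11 * t)), np.cumsum(rng.normal(size=T))],
    axis=1,
)
X_shuffled = X[rng.permutation(T)]  # same samples, temporal order destroyed


def excess_kurtosis(Z):
    """Per-column excess kurtosis, a simple measure of non-Gaussianity."""
    Zc = Z - Z.mean(axis=0)
    return (Zc ** 4).mean(axis=0) / (Zc ** 2).mean(axis=0) ** 2 - 3.0


def lag1_autocorr(Z):
    """Per-column lag-1 autocorrelation, which does depend on temporal order."""
    Zc = Z - Z.mean(axis=0)
    return (Zc[1:] * Zc[:-1]).sum(axis=0) / (Zc ** 2).sum(axis=0)


cov_unchanged = np.allclose(np.cov(X.T), np.cov(X_shuffled.T))
kurtosis_unchanged = np.allclose(excess_kurtosis(X), excess_kurtosis(X_shuffled))

print("covariance unchanged:", cov_unchanged)
print("kurtosis unchanged:  ", kurtosis_unchanged)
print("lag-1 autocorrelation before:", lag1_autocorr(X).round(2))
print("lag-1 autocorrelation after: ", lag1_autocorr(X_shuffled).round(2))
```

Any method whose objective only sees these order-free statistics cannot tell the original recording from the shuffled one, which is the limitation the paper sets out to address.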

"While model degeneracy in time is convenient for learning; as an assumption about the data the explicit lack of temporal dependence necessarily marginalizes out dynamics, which then must be extrapolated in post-hoc analysis."

Background

"Here we will formalize the problem of source separation with temporal dependencies and formulate the solution in terms of maximum likelihood estimation (MLE) and a recurrent model that parameterizes a conditionally independent distribution."

The data is composed of N ordered sequences of length T,

X_n = (x_{1,n}, x_{2,n}, \dots, x_{T,n})

where each element in the sequence, x_{t,n}, is a D-dimensional vector, and the index n enumerates the whole sequence.

The goal is to find a set of source signals:

S_n = (s_{1,n}, s_{2,n}, \dots, s_{T,n})

Subsequences can also be constructed here:

image.png
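As a rough sketch of the data layout described above, the N sequences can be held in one array of shape (N, T, D), and subsequences are then just slices along the time axis. The sizes, the random data, and the sliding_windows helper are all hypothetical and only serve to illustrate the indexing; note that the code is 0-indexed while the notation x_{t,n} is 1-indexed.

```python
import numpy as np

N, T, D = 4, 100, 8                # hypothetical sizes, for illustration only
rng = np.random.default_rng(0)
X = rng.normal(size=(N, T, D))     # X[n] plays the role of X_n, X[n, t] of x_{t,n}

x_tn = X[2, 10]                    # a single D-dimensional observation
window = X[2, 10:20]               # a subsequence of one sequence: 10 time steps


def sliding_windows(seq, length):
    """All contiguous subsequences of `seq` (shape (T, D)) with the given length."""
    return np.stack([seq[i:i + length] for i in range(seq.shape[0] - length + 1)])


windows = sliding_windows(X[2], 10)
print(x_tn.shape, window.shape, windows.shape)
```

The source sequences S_n would be stored the same way, with one source vector per time step.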

Having read this far, I still have no idea what the problem actually is...

"This problem can generally be understood as inference of unobserved or latent configurations from time-series observations."

"It is convenient to assume that the sources, S_n, are stochastic random variables with well-understood and interpretable noise, such as Gaussian or logistic variables with independence constraints."

"Representable as a directed graphical model in time, the choice of a-priori model structure, such as the relationship between latent variables and observations, can have consequences on model capacity and inference complexity."

My reading: the "latent variables and observations" here seem to correspond to the relationship between S and X?

"Directed graphical models often require complex approximate inference which introduces variance into learning. Rather than solving the general problem in Equation 3, we will assume that the generating function, G(), is noiseless, and the source sequences, S_n, have the same dimensionality as the data, X_n, with each source signal being composed of a set of conditionally independent components with density parameterized by a recurrent neural network (RNN)."
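To see what these assumptions buy, here is a short derivation sketch. It additionally assumes that the noiseless generating function is linear and invertible, with inverse W (the unmixing matrix, as in ICA); that linearity and the index convention s_{i,t,n} for component i are assumptions of this sketch. The change-of-variables formula then gives an exact likelihood with no approximate inference:

```latex
\begin{aligned}
s_{t,n} &= W x_{t,n}, \\
p(X_n) &= \prod_{t=1}^{T} \left|\det W\right| \prod_{i=1}^{D}
          p\!\left(s_{i,t,n} \mid s_{1:t-1,n}\right), \\
\log p(X_n) &= T \log\left|\det W\right|
          + \sum_{t=1}^{T} \sum_{i=1}^{D}
          \log p\!\left(s_{i,t,n} \mid s_{1:t-1,n}\right).
\end{aligned}
```

If the conditional densities ignore the past, so that p(s_{i,t,n} | s_{1:t-1,n}) = p_i(s_{i,t,n}), the last line reduces to the usual noiseless ICA log-likelihood.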

"We will show that the learning objective closely resembles that of noiseless independent component analysis (ICA). Assuming generation is noiseless and preserves dimensionality will reduce variance which would otherwise hinder learning with high-dimensional, low-sample size data, such as fMRI."
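A minimal NumPy sketch of such an objective follows, under stated assumptions: the unmixing is a linear map W, the per-component conditional densities are logistic, and their location and log-scale are read out from a small tanh RNN that has only seen past sources. The log-likelihood of one sequence is then T log|det W| plus the sum of the conditional log-densities. All function and parameter names here are hypothetical, and this is not the paper's implementation.

```python
import numpy as np


def logistic_logpdf(s, mu, log_scale):
    """Elementwise log-density of a logistic distribution."""
    z = (s - mu) * np.exp(-log_scale)
    return -z - 2.0 * np.logaddexp(0.0, -z) - log_scale


def init_params(D, H, rng, scale=0.1):
    """Random weights for a small tanh RNN (hypothetical parameter names)."""
    return {
        "R": scale * rng.normal(size=(H, D)),     # source -> hidden
        "U": scale * rng.normal(size=(H, H)),     # hidden -> hidden
        "c": np.zeros(H),                         # hidden bias
        "V_mu": scale * rng.normal(size=(D, H)),  # hidden -> location
        "b_mu": np.zeros(D),
        "V_ls": scale * rng.normal(size=(D, H)),  # hidden -> log-scale
        "b_ls": np.zeros(D),
    }


def rnn_ica_loglik(X, W, p):
    """Log-likelihood of one sequence X (shape (T, D)) under the sketch model."""
    T, D = X.shape
    S = X @ W.T                            # s_t = W x_t for every time step
    h = np.zeros(p["c"].shape[0])          # hidden state summarizing s_{<t}
    loglik = T * np.linalg.slogdet(W)[1]   # T * log|det W|
    for t in range(T):
        mu = p["V_mu"] @ h + p["b_mu"]
        log_scale = p["V_ls"] @ h + p["b_ls"]
        # Components are independent given the past, so their log-densities add.
        loglik += logistic_logpdf(S[t], mu, log_scale).sum()
        h = np.tanh(p["U"] @ h + p["R"] @ S[t] + p["c"])
    return loglik


rng = np.random.default_rng(0)
T, D, H = 50, 3, 4
X = rng.normal(size=(T, D))
W = np.eye(D) + 0.1 * rng.normal(size=(D, D))
params = init_params(D, H, rng)
print("log-likelihood:", rnn_ica_loglik(X, W, params))
```

Setting the readout weights V_mu and V_ls to zero removes the dependence on the past, and the function then returns the log-likelihood of ordinary noiseless ICA with a logistic prior, which is the resemblance the quote refers to. In practice W and the RNN weights would be trained by gradient ascent on this quantity, which this sketch does not do.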

Independent component analysis

References:

[2013.11.29 Lesson9-session2] 多變數分析-獨立成分分析ICA - YouTube
独立成分分析ICA原理 - 蔡希玉的博客 - CSDN博客
ICA与PCA的区别 - psybrain的博客 - CSDN博客
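ICA is also known as a method for blind source separation (BSS).
ICA stands for independent component analysis.
Take the cocktail party problem as an analogy. Suppose we are in a concert hall or at a ball, with microphones placed at various positions around the stage. Each microphone captures a mixture of the original signals, so there are as many mixed signals as there are microphones. The goal of ICA is to separate the mixed signals and extract, or reconstruct, the unmixed ones.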

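Mathematically, ICA is a linear transform, just like PCA. The transform decomposes the data or signal into a linear combination of statistically independent, non-Gaussian sources. It can be shown that as long as the sources are non-Gaussian (at most one of them may be Gaussian), the decomposition is unique up to the order and scale of the components.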

x = As

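Here A linearly combines the source signals s to produce the observed signals x. The goal of ICA is to estimate both the mixing matrix A and the sources s from x alone.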
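A tiny numerical sketch of the mixing model, with two hypothetical sources and a made-up 2x2 mixing matrix A. Here A is known, so the sources can be recovered by simply inverting it; the whole difficulty of ICA is that A is not known and must be estimated from x.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1000

# Two illustrative sources, one per row: a sine wave and uniform noise.
s = np.vstack([np.sin(np.linspace(0, 20, T)), rng.uniform(-1, 1, T)])

A = np.array([[1.0, 0.6],
              [0.4, 1.0]])        # hypothetical mixing matrix (two "microphones")
x = A @ s                         # observed mixtures, shape (2, T)

# With A known, unmixing is just a matrix inverse.
s_back = np.linalg.inv(A) @ x
print(np.allclose(s_back, s))
```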

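[ICA vs PCA] ICA recovers the source data by multiplying the data by an unmixing matrix. PCA decorrelates its outputs so that each successive component explains as much of the variance in the data as possible. ICA instead tries to make its outputs statistically independent, so that each component reflects as much of the time-independent information in the data as possible.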

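ICA requires the number of independent sources to be specified in advance. The user therefore needs some prior knowledge of the data and its characteristics; the number cannot be chosen arbitrarily. The computation of PCA, by contrast, needs no such parameter.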

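It is generally said that PCA assumes the sources are mutually uncorrelated. In PCA the "sources" are simply the principal component directions, and uncorrelated here means that those directions are orthogonal.
ICA assumes the sources are mutually independent, because the components it recovers must be statistically independent.
PCA treats the principal components as mutually orthogonal and the samples as Gaussian-distributed, whereas ICA requires the data to be non-Gaussian.
The goal of PCA is to find the uncorrelated parts of the signal (orthogonality), which corresponds to second-order statistics (maximum variance). As discussed previously, PCA can be implemented in two ways: eigenvalue decomposition and SVD. PCA amounts to a change of basis that gives the transformed data maximal variance, where variance measures how much information a variable carries.
ICA finds the mutually independent parts that make up the signal. It does not require orthogonality and corresponds to higher-order statistics. In ICA theory, the observed mixed signals X are obtained by linearly weighting the independent sources S with a mixing matrix A. The goal of ICA is to estimate from X an unmixing matrix W such that applying W to X gives the best approximation of the independent sources S.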

X = AS, A = W^{-1}, WX = Y, Y = \hat{S}

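Unlike PCA, the goal of ICA is not to reduce the dimensionality of the data, but to find, as far as possible, signal sources in the mixed signals that are more physiologically or physically meaningful.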
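The difference between decorrelation and independence can be checked on a toy problem. The sketch below mixes two independent uniform sources with a made-up matrix A, then compares what sklearn's PCA and FastICA return. The data, the matrix, and the match_to_sources helper are assumptions made for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA

rng = np.random.default_rng(0)
T = 5000
S = rng.uniform(-1, 1, size=(T, 2))        # two independent, non-Gaussian sources
A = np.array([[1.0, 0.5],
              [0.5, 1.0]])                 # hypothetical mixing matrix
X = S @ A.T                                # observed mixtures, one row per sample

Y_pca = PCA(n_components=2).fit_transform(X)
Y_ica = FastICA(n_components=2, random_state=0).fit_transform(X)


def match_to_sources(S, Y):
    """For each true source, the best absolute correlation with any output."""
    k = S.shape[1]
    cross = np.abs(np.corrcoef(S.T, Y.T)[:k, k:])
    return cross.max(axis=1)


print("correlation between PCA outputs:", np.corrcoef(Y_pca.T)[0, 1].round(3))
print("best |corr| with sources, PCA:", match_to_sources(S, Y_pca).round(3))
print("best |corr| with sources, ICA:", match_to_sources(S, Y_ica).round(3))
```

The PCA outputs are uncorrelated with each other, yet each one is still a blend of both sources, because second-order statistics cannot pin down the remaining rotation. FastICA uses higher-order statistics to resolve that rotation and recovers the sources up to order, sign, and scale.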

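[Assumptions of ICA]

The source signals are mutually independent, i.e. their joint distribution is the product of their marginal distributions.
The source signals follow non-Gaussian distributions.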
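Written out, the two assumptions look as follows. The first line is the independence assumption. The remaining lines sketch why Gaussian sources are a problem: for unit-variance Gaussian sources, any orthogonal matrix Q (a symbol introduced here for the argument) leaves the source distribution unchanged, so the mixing matrix cannot be identified.

```latex
\begin{aligned}
p(s_1, \dots, s_D) &= \prod_{i=1}^{D} p_i(s_i), \\[4pt]
s \sim \mathcal{N}(0, I),\ Q^{\top} Q = I
  &\;\Longrightarrow\; Qs \sim \mathcal{N}(0, Q Q^{\top}) = \mathcal{N}(0, I), \\
x = As &= (A Q^{\top})(Q s).
\end{aligned}
```

Both (A, s) and (AQ^T, Qs) therefore describe the same observed distribution with independent Gaussian sources, and no amount of data can tell them apart. Non-Gaussian sources break this rotational symmetry.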

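ICA in Python

The mathematical derivation behind ICA is fairly involved. If needed, I will dig into it in a dedicated post later. For now, let's look at a Python implementation of ICA: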

import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import FastICA

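# Build four different source signals, each with C = 200 samples
C = 200
x = np.arange(C)
s1 = 2 * np.sin(0.02 * np.pi * x)          # sine wave
a = np.linspace(-2, 2, 25)
s2 = np.concatenate([a] * 8)               # sawtooth wave: 8 ramps of 25 samples
s3 = np.array(20 * (5 * [2] + 5 * [-2]))   # square wave
s4 = np.random.random(C)                   # uniform random signal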

# Plot the signals, one per subplot
ax1 = plt.subplot(411)
ax2 = plt.subplot(412)
ax3 = plt.subplot(413)
ax4 = plt.subplot(414)
ax1.plot(s1)
ax2.plot(s2)
ax3.plot(s3)
ax4.plot(s4)


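These four waveforms are a sine wave, a sawtooth wave, a square wave, and a random signal. They stand in for four mutually independent source signals.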


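# Stack the sources into a (4, C) array and mix them with a random 4x4 matrix
s = np.array([s1, s2, s3, s4])
ran = np.random.random([4, 4])   # random mixing matrix A
mix = np.dot(ran, s)             # observed signals, shape (4, C)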
ax1 = plt.subplot(411)
ax2 = plt.subplot(412)
ax3 = plt.subplot(413)
ax4 = plt.subplot(414)
ax1.plot(mix[0])
ax2.plot(mix[1])
ax3.plot(mix[2])
ax4.plot(mix[3])

Randomly mixing the four source signals yields four observed signals.


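# Fit FastICA on the mixtures; rows of mix.T are samples, columns are channels
ica = FastICA(n_components=4)
u = ica.fit_transform(mix.T)   # estimated sources, shape (C, 4)
print(ica.n_iter_)             # number of iterations FastICA ran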
ax1 = plt.subplot(411)
ax2 = plt.subplot(412)
ax3 = plt.subplot(413)
ax4 = plt.subplot(414)
ax1.plot(u[:,0])
ax2.plot(u[:,1])
ax3.plot(u[:,2])
ax4.plot(u[:,3])

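Here the independent component decomposition is done with sklearn's FastICA. The recovered components closely resemble the source signals, although their order, sign, and scale may differ.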

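# Fit only, then apply the learned unmixing matrix by hand
ica = FastICA(n_components=4)
ica.fit(mix.T)
w = ica.components_   # unmixing matrix, shape (4, 4)
u = np.dot(w, mix)    # applied to the raw mixtures, shape (4, C)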
ax1 = plt.subplot(411)
ax2 = plt.subplot(412)
ax3 = plt.subplot(413)
ax4 = plt.subplot(414)
ax1.plot(u[0])
ax2.plot(u[1])
ax3.plot(u[2])
ax4.plot(u[3])

This version uses ica.components_, which is the unmixing matrix used to separate the independent source signals from the mixed signals.
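One detail worth checking is how the two variants relate. With whitening enabled (the default), fit_transform subtracts the per-channel mean stored in ica.mean_ before applying components_, whereas multiplying the raw mixtures by components_ does not. The sketch below rebuilds the same kind of signals with a fixed seed (the seed and variable names are my own choices) and confirms that the two results differ only by a constant offset per component.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
C = 200
x = np.arange(C)
s1 = 2 * np.sin(0.02 * np.pi * x)            # sine wave
s2 = np.tile(np.linspace(-2, 2, 25), 8)      # sawtooth wave
s3 = np.array(20 * (5 * [2] + 5 * [-2]))     # square wave
s4 = rng.random(C)                           # uniform random signal
mix = rng.random((4, 4)) @ np.array([s1, s2, s3, s4])   # mixtures, shape (4, C)

ica = FastICA(n_components=4, random_state=0)
u_fit = ica.fit_transform(mix.T)             # shape (C, 4), data centered internally
u_manual = ica.components_ @ mix             # shape (4, C), mean not removed
offset = ica.components_ @ ica.mean_         # one constant per component

same_up_to_offset = np.allclose(u_manual.T - offset, u_fit)
print("identical after removing the offset:", same_up_to_offset)
```

For plotting, the offset only shifts each curve vertically, so both variants show the same waveforms.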