注：本文为 "标量、向量、矩阵和张量的区别" 相关合辑。英文引文，机翻未校。如有内容异常，请看原文。

Difference Between Scalar, Vector, Matrix and Tensor
标量、向量、矩阵和张量的区别
Last Updated : 06 Aug, 2025

In the context of mathematics and machine learning, scalar, vector, matrix, and tensor are all different types of mathematical objects that represent different concepts and have different properties. Here in this article, we will discuss in detail scalars, vectors, matrices, tensors, and finally the differences between them.
在数学和机器学习的背景下，标量、向量、矩阵和张量是不同类型的数学对象，它们代表不同的概念并具有不同的属性。在本文中，我们将详细讨论标量、向量、矩阵、张量，以及它们之间的区别。

What is Scalar?
什么是标量

Scalars are singular numerical entities within the realm of Data Science, devoid of any directional attributes.
在数据科学领域，标量是单一的数值实体，没有任何方向属性。

They serve as the elemental components utilized in mathematical computations and algorithmic frameworks across these domains. In practical terms, scalars often represent fundamental quantities such as constants, probabilities, or error metrics.
它们是这些领域中用于数学计算和算法框架的基本组件。在实际中，标量通常表示基本量，例如常数、概率或误差指标。

For instance, within Machine Learning, a scalar may denote the accuracy of a model or the value of a loss function. Similarly, in Data Science, scalars are employed to encapsulate statistical metrics like mean, variance, or correlation coefficients. Despite their apparent simplicity, scalars assume a critical role in various AI-ML-DS tasks, spanning optimization, regression analysis, and classification algorithms. Proficiency in understanding scalars forms the bedrock for comprehending more intricate concepts prevalent in these fields.
例如，在机器学习中，标量可以表示模型的准确率或损失函数的值。同样，在数据科学中，标量用于封装诸如均值、方差或相关系数等统计指标。尽管标量看起来很简单，但它们在各种人工智能 - 机器学习 - 数据科学任务中起着关键作用，涵盖优化、回归分析和分类算法。掌握标量的理解是理解这些领域中更复杂概念的基础。

In Python we can represent a Scalar like:
在 Python 中我们可以这样表示标量：

# Scalars can be represented simply as numerical variables
# 标量可以简单地表示为数值变量
scalar = 8.4
scalar

Output:
输出：

8.4

What are Vectors?
什么是向量

Vectors, within the context of Data Science, represent ordered collections of numerical values endowed with both magnitude and directionality. They serve as indispensable tools for representing features, observations, and model parameters within AI-ML-DS workflows.
在数据科学的背景下，向量是具有大小和方向的数值有序集合的表示。它们是人工智能 - 机器学习 - 数据科学工作流中用于表示特征、观测值和模型参数的不可或缺的工具。

In Artificial Intelligence, vectors find application in feature representation, where each dimension corresponds to a distinct feature of the dataset.
在人工智能中，向量用于特征表示，其中每个维度对应数据集的一个不同特征。

In Machine Learning, vectors play a pivotal role in encapsulating data points, model parameters, and gradient computations during the training process. Moreover, within DS, vectors facilitate tasks like data visualization, clustering, and dimensionality reduction. Mastery over vector concepts is paramount for engaging in activities like linear algebraic operations, optimization via gradient descent, and the construction of complex neural network architectures.
在机器学习中，向量在封装数据点、模型参数以及训练过程中的梯度计算方面发挥着关键作用。此外，在数据科学中，向量有助于数据可视化、聚类和降维等任务。掌握向量概念对于进行线性代数运算、通过梯度下降进行优化以及构建复杂神经网络架构等活动至关重要。

In Python we can represent a Vector like:
在 Python 中我们可以这样表示向量：

import numpy as np
# Vectors can be represented as one-dimensional arrays
# 向量可以表示为一维数组
vector = np.array([2, -3, 1.5])
vector

Output:
输出：

array([ 2. , -3. , 1.5])

What are Matrices?
什么是矩阵 Matrices, as two-dimensional arrays of numerical values, enjoy widespread utility across AI-ML-DS endeavors. They serve as foundational structures for organizing and manipulating tabular data, wherein rows typically represent observations and columns denote features or variables. 作为数值的二维数组矩阵在人工智能 - 机器学习 - 数据科学领域中被广泛使用。它们是组织和操作表格数据的基础结构其中行通常表示观测值列表示特征或变量。 Matrices facilitate a plethora of statistical operations, including matrix multiplication, determinant calculation, and singular value decomposition. 矩阵促进了包括矩阵乘法、行列式计算和奇异值分解在内的大量统计运算。 In the domain of AI, matrices find application in representing weight matrices within neural networks, with each element signifying the synaptic connection strength between neurons. Similarly, within ML, matrices serve as repositories for datasets, building kernel matrices for support vector machines, and implementing dimensionality reduction techniques such as principal component analysis. Within DS, matrices are indispensable for data preprocessing, transformation, and model assessment tasks. 在人工智能领域矩阵用于表示神经网络中的权重矩阵其中每个元素表示神经元之间的突触连接强度。同样在机器学习中矩阵作为数据集的存储库用于构建支持向量机的核矩阵以及实现主成分分析等降维技术。在数据科学中矩阵对于数据预处理、转换和模型评估任务是不可或缺的。 In Python we can represent a Matrix like: 在 Python 中我们可以这样表示矩阵 import numpy as np # Matrices can be represented as two-dimensional arrays // 矩阵可以表示为二维数组 matrix np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])Output: 输出 array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])What are Tensors? 什么是张量 Tensors in Data Science generalize the concept of vectors and matrices to higher dimensions. They are multidimensional arrays of numerical values which can constitute complex data structures and relationships. 在数据科学中张量将向量和矩阵的概念推广到更高维度。它们是数值的多维数组可以构成复杂的数据结构和关系。 Tensors are integral in deep learning frameworks like TensorFlow and PyTorch, in which they may be used to store and manipulate multi-dimensional data such as images, videos, and sequences. 张量是深度学习框架如 TensorFlow 和 PyTorch的重要组成部分它们可用于存储和操作多维数据例如图像、视频和序列。 In AI, tensors are employed for representing input data, model parameters, and intermediate activations in neural networks. In ML, tensors facilitate operations in convolutional neural networks, recurrent neural networks, and transformer architectures. Moreover, in DS, tensors are utilized for multi-dimensional data analysis, time - series forecasting, and natural language processing tasks. Understanding tensors is crucial for advanced AI-ML-DS practitioners, as they allow the modeling and analysis of intricate data patterns and relationships across multiple dimensions. 
在人工智能中张量用于表示神经网络中的输入数据、模型参数和中间激活。在机器学习中张量有助于卷积神经网络、循环神经网络和变换器架构中的操作。此外在数据科学中张量用于多维数据分析、时间序列预测和自然语言处理任务。对于高级人工智能 - 机器学习 - 数据科学从业者来说理解张量至关重要因为它们允许对多维度中的复杂数据模式和关系进行建模和分析。 In Python we can represent a Tensor like: 在 Python 中我们可以这样表示张量 import numpy as np # Tensors can be represented as multi-dimensional arrays // 张量可以表示为多维数组 tensor np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]]) tensorOutput: 输出 array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])Scalar Vs Vector Vs Matrix Vs Tensor 标量与向量与矩阵与张量 AspectScalar标量Vector向量Matrix矩阵Tensor张量Dimensionality维度012≥ 3Representation表示Single numerical value单个数值Ordered array of values有序值数组Two - dimensional array of values二维值数组Multidimensional array of values多维值数组Usage用途Represent basic quantities表示基本量Represent features, observations表示特征、观测值Organize data in tabular format以表格格式组织数据Handle complex data structures处理复杂数据结构Examples示例Error metrics, probabilities误差指标、概率Feature vectors, gradients特征向量、梯度Data matrices, weight matrices数据矩阵、权重矩阵Image tensors, sequence tensors图像张量、序列张量Manipulation操作Simple arithmetic operations简单算术运算Linear algebra operations线性代数运算Matrix operations, linear transformations矩阵运算、线性变换Tensor operations, deep learning operations张量运算、深度学习运算Data Representation数据表示Point in space空间中的点Direction and magnitude in space空间中的方向和大小Rows and columns in tabular format表格格式中的行和列Multi - dimensional relationships多维关系Applications应用Basic calculations, statistical measures基本计算、统计量Machine learning models, data representation机器学习模型、数据表示Data manipulation, statistical analysis数据操作、统计分析Deep learning, natural language processing深度学习、自然语言处理Notation符号表示Lowercase letters or symbols小写字母或符号Boldface letters or arrows粗体字母或箭头Uppercase boldface letters大写粗体字母Boldface uppercase letters or indices粗体大写字母或索引 Conclusion 结论 We can conclude that the understanding of scalars, vectors, matrices, and tensors is paramount in the fields of Data Science, as they serve as fundamental building blocks for mathematical representation, computation, and analysis of data and models. Scalars, representing single numerical values, play a foundational role in basic calculations and statistical measures. Vectors, with their magnitude and direction, enable the representation of features, observations, and model parameters, crucial for machine learning tasks. Matrices organize data in a tabular format, facilitating operations like matrix multiplication and linear transformations, essential for statistical analysis and machine learning algorithms. Tensors, extending the concept to higher dimensions, handle complex data structures and relationships, powering advanced techniques in deep learning and natural language processing. Mastery of these mathematical entities empowers practitioners to model and understand intricate data patterns and relationships, driving innovation and advancement in AI, ML, and DS domains. 我们可以得出结论理解标量、向量、矩阵和张量在数据科学领域至关重要因为它们是数据和模型的数学表示、计算和分析的基本构建块。标量表示单个数值在基本计算和统计量中起基础作用。具有大小和方向的向量能够表示特征、观测值和模型参数这对于机器学习任务至关重要。矩阵以表格格式组织数据便于进行矩阵乘法和线性变换等操作这对于统计分析和机器学习算法是必不可少的。张量将概念扩展到更高维度处理复杂的数据结构和关系推动深度学习和自然语言处理中的高级技术。掌握这些数学实体使从业者能够对复杂的数据模式和关系进行建模和理解推动人工智能、机器学习和数据科学领域的创新和进步。 What’s the difference between a matrix and a tensor? 矩阵和张量有什么区别 Aug 28,2017 Steven Steinke There is a short answer to this question,so let’s start there.Then we can take a look at an application to get a little more insight. 
这个问题有一个简短的答案让我们从那里开始。然后我们可以看看一个应用以获得更多的见解。 A matrix is a grid of n×mn \times mn×m (say, 3×33 \times 33×3) numbers surrounded by brackets.We can add and subtract matrices of the same size,multiply one matrix with another as long as the sizes are compatible ((n×m)×(m×p)n×p)((n \times m) \times (m \times p) n \times p )((n×m)×(m×p)n×p),and multiply an entire matrix by a constant.A vector is a matrix with just one row or column (but see below).So there are a bunch of mathematical operations that we can do to any matrix. 矩阵是一个由括号包围的 n×mn \times mn×m比如说3×33 \times 33×3数字网格。我们可以对相同大小的矩阵进行加法和减法运算只要大小兼容 ((n×m)×(m×p)n×p)((n \times m) \times (m \times p) n \times p )((n×m)×(m×p)n×p)就可以将一个矩阵与另一个矩阵相乘还可以将整个矩阵乘以一个常数。向量是一个只有一行或一列的矩阵但见下文。因此我们可以对任何矩阵进行许多数学运算。 The basic idea,though,is that a matrix is just a 2 - D grid of numbers. 然而基本的想法是矩阵只是一个二维数字网格。 A tensor is often thought of as a generalized matrix.That is,it could be a 1 - D matrix (a vector is actually such a tensor),a 3 - D matrix (something like a cube of numbers),even a 0 - D matrix (a single number),or a higher dimensional structure that is harder to visualize.The dimension of the tensor is called its rank. 张量通常被认为是一种广义矩阵。也就是说它可以是一个一维矩阵向量实际上就是这样的张量、一个三维矩阵类似于一个数字立方体、甚至是一个零维矩阵一个单独的数字或者是一个更难可视化的更高维结构。张量的维度被称为它的秩。 But this description misses the most important property of a tensor! 但这种描述遗漏了张量最重要的属性 A tensor is a mathematical entity that lives in a structure and interacts with other mathematical entities.If one transforms the other entities in the structure in a regular way,then the tensor must obey a related transformation rule. 张量是一种生活在结构中并与其他数学实体相互作用的数学实体。如果以一种规律的方式转换结构中的其他实体那么张量必须遵循相关的转换规则。 This dynamicalproperty of a tensor is the key that distinguishes it from a mere matrix.It’s a team player whose numerical values shift around along with those of its teammates when a transformation is introduced that affects all of them. 这种张量的“动态”属性是区分它与普通矩阵的关键。它是一个团队合作者当引入影响它们所有人的转换时它的数值会随着队友的数值而变化。 Any rank - 2 tensor can be represented as a matrix,but not every matrix is really a rank - 2 tensor.The numerical values of a tensor’s matrix representation depend on what transformation rules have been applied to the entire system. 任何二阶张量都可以表示为矩阵但并非每个矩阵都是真正的二阶张量。张量的矩阵表示的数值取决于已应用于整个系统的转换规则。 This answer might be enough for your purposes,but we can do a little example to illustrate how this works.The question came up in a Deep Learning workshop,so let’s look at a quick example from that field. 
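(As a quick illustration of the operations just listed — addition of equal-sized matrices, products with compatible shapes, and scaling by a constant — here is a minimal NumPy sketch; the particular numbers are made up and are not part of the original article.)

import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])   # a 2x3 matrix
B = np.array([[6.0, 5.0, 4.0],
              [3.0, 2.0, 1.0]])   # same size, so A + B and A - B are defined
C = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])        # 3x2, so A @ C is a (2x3)(3x2) = 2x2 product

print(A + B)      # element-wise sum of two matrices of the same size
print(2.5 * A)    # multiplying an entire matrix by a constant
print(A @ C)      # matrix product with compatible shapes
v = np.array([1.0, -1.0, 2.0])
print(A @ v)      # a vector acts like a one-column matrix here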
这个答案可能足以满足你的需求但我们可以做一个小例子来说明这是如何工作的。这个问题是在一个深度学习研讨会上提出的因此让我们看看该领域的快速示例。 Suppose I have a hidden layer of 3 nodes in a neural network.Data flowed into them,went through their ReLU functions,and out popped some values.Let’s say,for definiteness,we got 2.5,4,and 1.2,respectively.(Don’t worry,a diagram is coming.)We could represent these nodes’ output as a vector, 假设我在一个神经网络中有一个包含 3 个节点的隐藏层。数据流入它们经过它们的 ReLU 函数然后弹出一些值。为了明确起见我们得到了 2.5、4 和 1.2。别担心图表马上就会出现。我们可以用一个向量来表示这些节点的输出 L1[2.541.2]{{L}_{1}}\left[ \begin{matrix} 2.5 \\ 4 \\ 1.2 \\\end{matrix} \right]L1​​2.541.2​​ Let’s say there’s another layer of 3 nodes coming up.Each of the 3 nodes from the first layer has a weight associated with its input to each of the next 3 nodes.It would be very convenient,then,to write these weights as a 3×33 \times 33×3 matrix of entries.Suppose we’ve updated the network already many times and arrived at the weights (chosen semi - randomly for this example), 假设接下来还有一个包含 3 个节点的层。第一层的每个 3 个节点都有一个权重与其输入到下一层的每个 3 个节点相关联。那么将这些权重写成一个 3×33 \times 33×3 的矩阵条目将非常方便。假设我们已经多次更新了网络并得到了权重为了这个例子半随机选择的 W12[−10.41.50.80.50.750.2−0.31]{{W}_{12}}\left[ \begin{matrix} -1 0.4 1.5 \\ 0.8 0.5 0.75 \\ 0.2 -0.3 1 \\ \end{matrix} \right]W12​​−10.80.2​0.40.5−0.3​1.50.751​​ Here,the weights from one row all go to the same node in the next layer,and those in a particular column all come from the same node in the first layer.For example,the weight that incoming node 1 contributes to outgoing node 3 is 0.2 (row 3,col 1). 在这里一行中的权重都流向下一层次的同一个节点而特定列中的权重都来自第一层的同一个节点。例如输入节点 1 对输出节点 3 的贡献权重是 0.2第 3 行第 1 列。 We can compute the total values fed into the next layer of nodes by multiplying the weight matrix by the input vector, 我们可以通过将权重矩阵乘以输入向量来计算输入到下一层节点的总值 W12L1L2→[−10.41.50.80.50.750.2−0.31][2.541.2][0.94.90.5]{{W}_{12}}{{L}_{1}}{{L}_{2}}\to \left[ \begin{matrix} -1 0.4 1.5 \\ 0.8 0.5 0.75 \\ 0.2 -0.3 1 \\ \end{matrix} \right]\left[ \begin{matrix} 2.5 \\ 4 \\ 1.2 \\ \end{matrix} \right]\left[ \begin{matrix} 0.9 \\ 4.9 \\ 0.5 \\ \end{matrix} \right]W12​L1​L2​→​−10.80.2​0.40.5−0.3​1.50.751​​​2.541.2​​​0.94.90.5​​ Don’t like matrices?Here’s a diagram.The data flow from left to right. 不喜欢矩阵这里有一个图表。数据从左向右流动。 Great! So far,all we have seen are some simple manipulations of matrices and vectors. 太好了到目前为止我们看到的都是一些矩阵和向量的简单操作。 Suppose I want to meddle around and use custom activation functions for each neuron.A dumb way to do this would be to rescale each of the ReLU functions from the first layer individually.For the sake of this example,let’s suppose I scale the first node up by a factor of 2,leave the second node alone,and scale the third node down by 1/5.This would change the graphs of these functions as pictured below: 假设我想干预一下为每个神经元使用自定义激活函数。一个愚蠢的方法是分别重新调整第一层的每个 ReLU 函数的大小。为了这个例子假设我将第一个节点的大小增加 2 倍保持第二个节点不变并将第三个节点缩小 1/5。这将改变这些函数的图表如下图所示 The effect of this modification is to change the values spit out by the first layer by factors of 2, 1, and 1/5, respectively. 
That’s equivalent to multiplying L1L_1L1​ by a matrix AAA , 这种修改的效果是将第一层输出的值分别按 2、1 和 1/5 的比例进行改变。这相当于将 L1L_1L1​ 乘以一个矩阵 AAA AL1L1′→[200010000.2][2.541.2][540.24]A{{L}_{1}}{{L}_{1}}^{}\to \left[ \begin{matrix} 2 0 0 \\ 0 1 0 \\ 0 0 0.2 \\ \end{matrix} \right]\left[ \begin{matrix} 2.5 \\ 4 \\ 1.2 \\ \end{matrix} \right]\left[ \begin{matrix} 5 \\ 4 \\ 0.24 \\ \end{matrix} \right]AL1​L1​′→​200​010​000.2​​​2.541.2​​​540.24​​ Now,if these new values are fed through the original network of weights,we get totally different output values,as illustrated in the diagram: 现在如果将这些新值通过原始权重网络我们将得到完全不同的输出值如图所示 If the neural network were functioning properly before,we’ve broken it now.We’ll have to rerun the training to get the correct weights back. 如果神经网络之前运行正常我们现在破坏了它。我们将不得不重新运行训练以恢复正确的权重。 Or will we? 还是这样吗 The value at the first node is twice as big as before.If we cut all of its outgoing weights by 1/2,its net contribution to the next layer is unchanged.We didn’t do anything to the second node,so we can leave its weights alone.Lastly,we’ll need to multiply the final set of weights by 5 to compensate for the 1/5 factor on that node.This is equivalent,mathematically speaking,to using a new set of weights which we obtain by multiplying the original weight matrix by the inverse matrix of AAA : 第一个节点的值是之前的两倍。如果我们将其所有输出权重减半它对下一层的净贡献保持不变。我们没有对第二个节点做任何事情所以我们可以保留其权重不变。最后我们需要将最终一组权重乘以 5以补偿该节点的 1/5 因子。从数学上讲这相当于使用一组新的权重我们通过将原始权重矩阵乘以 AAA 的逆矩阵来获得 W12A−1W12′[−10.41.50.80.50.750.2−0.31][0.500010005][−0.50.47.50.40.53.750.1−0.35]{{W}_{12}}{{A}^{-1}}{{W}_{12}}^{}\left[ \begin{matrix} -1 0.4 1.5 \\ 0.8 0.5 0.75 \\ 0.2 -0.3 1 \\ \end{matrix} \right]\left[ \begin{matrix} 0.5 0 0 \\ 0 1 0 \\ 0 0 5 \\ \end{matrix} \right]\left[ \begin{matrix} -0.5 0.4 7.5 \\ 0.4 0.5 3.75 \\ 0.1 -0.3 5 \\ \end{matrix} \right]W12​A−1W12​′​−10.80.2​0.40.5−0.3​1.50.751​​​0.500​010​005​​​−0.50.40.1​0.40.5−0.3​7.53.755​​ If we combine the modified output of the first layer with the modified weights,we end up with the correct values reaching the second layer: 如果我们把第一层修改后的输出和修改后的权重结合起来我们就会得到到达第二层的正确值 Hurray!The network is working again,despite our best efforts to the contrary! 太好了尽管我们尽力破坏但网络又开始工作了 OK,there’s been a ton of math,so let’s just sit back for a second and recap. 好吧已经有很多数学内容了让我们稍作休息回顾一下。 When we thought of the node inputs,outputs and weights as fixed quantities,we called them vectors and matrices and were done with it. 当我们把节点输入、输出和权重视为固定量时我们称它们为向量和矩阵然后就结束了。 But once we started monkeying around with one of the vectors,transforming it in a regular way,we had to compensate by transforming the weights in an opposite manner.This added,integrated structure elevates the mere matrix of numbers to a true tensor 但一旦我们开始以一种规律的方式调整其中一个向量我们就必须通过以相反的方式转换权重来进行补偿。这种增加的、整合的结构将单纯的数字矩阵提升为真正的张量。 In fact,we can characterize its tensor nature a little bit further.If we call the changes made to the nodes covariant (ie,varying with the node and multiplied by AAA ),that makes the weights a contravariant tensor (varying against the nodes,specifically,multiplied by the inverse of AAA instead of AAA itself).A tensor can be covariant in one dimension and contravariant in another,but that’s a tale for another time. 事实上我们可以进一步描述它的张量特性。如果我们把对节点所做的改变称为协变的即与节点一起变化并乘以 AAA 那么这就使得权重成为一个逆变张量与节点相反变化具体来说乘以 AAA 的逆矩阵而不是 AAA 本身。张量可以在一个维度上是协变的在另一个维度上是逆变的但这是另一个故事了。 And now you know the difference between a matrix and a tensor. 现在你知道矩阵和张量的区别了。 What are the Differences Between a Matrix and a Tensor? 
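(Before turning to the Q&A below, the compensation argument from the example just finished is easy to verify numerically; this NumPy sketch simply reproduces the numbers from the text: the covariant change A on the node outputs and the contravariant change A⁻¹ on the weights cancel out.)

import numpy as np

L1 = np.array([2.5, 4.0, 1.2])            # outputs of the first layer
W12 = np.array([[-1.0,  0.4, 1.5],
                [ 0.8,  0.5, 0.75],
                [ 0.2, -0.3, 1.0]])       # weights from layer 1 to layer 2

L2 = W12 @ L1
print(L2)                                 # [0.9 4.9 0.5], as in the text

A = np.diag([2.0, 1.0, 0.2])              # rescaling of the three activation functions
L1_new = A @ L1                           # [5.  4.  0.24]
W12_new = W12 @ np.linalg.inv(A)          # compensating change of the weights

print(W12_new @ L1_new)                   # [0.9 4.9 0.5] again: the network still works
print(np.allclose(W12_new @ L1_new, L2))  # True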
矩阵与张量的区别是什么 asked Jun 5, 2013 at 21:52 Aurelius What is the difference between a matrix and a tensor? Or, what makes a tensor, a tensor? I know that a matrix is a table of values, right? But, a tensor? 矩阵和张量的区别是什么或者说是什么让一个张量成为张量我知道矩阵是一个数值表格对吗但张量呢 Continuing with your analogy, a matrix is just a two-dimensional table to organize information and a tensor is just its generalization. You can think of a tensor as a higher-dimensional way to organize information. So a matrix (5x5 for example) is a tensor of rank 2. And a tensor of rank 3 would be a “3D-matrix” like a 5x5x5 matrix. 顺着你的类比来说矩阵只是一个用于组织信息的二维表格而张量是它的推广。你可以把张量看作是一种更高维度的信息组织方式。因此一个矩阵例如 5×5 的矩阵是秩为 2 的张量。而秩为 3 的张量可以是一个“三维矩阵”比如 5×5×5 的矩阵。 I thought a rank n tensor is multi-linear function which takes n vectors and returns a vector of the same vector space? 我认为 n 秩张量是一个多线性函数它接收 n 个向量并返回同一个向量空间中的一个向量 One point of confusion is that in machine learning, people often use the term “tensor” when they really mean just “multidimensional array”. I disagree that a tensor of rank 3 would be a “3D matrix”, but admittedly it’s not uncommon to hear the word “tensor” used in this way. (http://stats.stackexchange.com/questions/198061/why-the-sudden-fascination-with-tensors/198127) 一个容易混淆的点是在机器学习中人们经常使用“张量”这个术语但实际上他们指的只是“多维数组”。我不认同秩为 3 的张量是“三维矩阵”这一说法但不可否认这种“张量”的用法并不少见。 Maybe to see the difference between rank 2 tensors and matrices, it is probably best to see a concrete example. Actually this is something which back then confused me very much in the linear algebra course (where we didn’t learn about tensors, only about matrices). 或许要理解秩为 2 的张量和矩阵之间的区别最好看一个具体的例子。实际上这是我当时在线性代数课程中那门课我们没有学张量只学了矩阵非常困惑的事情。 As you may know, you can specify a linear transformation between vectors by a matrix. Let’s call that matrix AAA. Now if you do a basis transformation, this can also be written as a linear transformation, so that if the vector in the old basis is vvv, the vector in the new basis is T−1vT^{-1}vT−1v (where vvv is a column vector). Now you can ask what matrix describes the transformation in the new basis. Well, it’s the matrix T−1ATT^{-1}ATT−1AT. 如你所知你可以用一个矩阵来表示向量之间的线性变换。我们称这个矩阵为 AAA。现在如果你进行基变换这也可以写成一个线性变换因此如果旧基下的向量是 vvv那么新基下的向量是 T−1vT^{-1}vT−1v其中 vvv 是列向量。现在你可能会问在新基下描述这个变换的矩阵是什么。答案是这个矩阵是 T−1ATT^{-1}ATT−1AT。 Well, so far, so good. What I memorized back then is that under basis change a matrix transforms as T−1ATT^{-1}ATT−1AT. 到目前为止一切都还顺利。我当时记住的是在基变换下矩阵按照 T−1ATT^{-1}ATT−1AT 的方式变换。 But then, we learned about quadratic forms. Those are calculated using a matrix AAA as uTAvu^T A vuTAv. Still, no problem, until we learned about how to do basis changes. Now, suddenly the matrix did not transform as T−1ATT^{-1}ATT−1AT, but rather as TTATT^T A TTTAT. Which confused me like hell: how could one and the same object transform differently when used in different contexts? 但后来我们学习了二次型。二次型是用矩阵 AAA 按照 uTAvu^T A vuTAv 来计算的。这仍然没有问题直到我们学习了如何进行基变换。这时突然发现这个矩阵不再按照 T−1ATT^{-1}ATT−1AT 的方式变换而是按照 TTATT^T A TTTAT 的方式变换。这让我非常困惑同一个对象在不同情境下使用时怎么会有不同的变换方式呢 Well, the solution is: because we are actually talking about different objects! In the first case, we are talking about a tensor that takes vectors to vectors. In the second case, we are talking about a tensor that takes two vectors into a scalar, or equivalently, which takes a vector to a covector. 答案是因为我们实际上在谈论不同的对象在第一种情况下我们谈论的是一个将向量映射到向量的张量。在第二种情况下我们谈论的是一个将两个向量映射到一个标量的张量或者等价地说是一个将向量映射到余向量的张量。 Now both tensors have n2n^2n2 components, and therefore it is possible to write those components in a n×nn \times nn×n matrix. 
And since all operations are either linear or bilinear, the normal matrix-matrix and matrix-vector products together with transposition can be used to write the operations of the tensor. Only when looking at basis transformations, you see that both are, indeed, not the same, and the course did us (well, at least me) a disservice by not telling us that we are really looking at two different objects, and not just at two different uses of the same object, the matrix. 现在这两种张量都有 n2n^2n2 个分量因此可以将这些分量写成一个 n×nn \times nn×n 的矩阵。而且由于所有运算要么是线性的要么是双线性的所以可以用常规的矩阵乘法、矩阵-向量乘法以及转置来表示张量的运算。只有在考察基变换时你才会发现它们确实不是同一个东西而这门课没有告诉我们其实我们看到的是两个不同的对象而不仅仅是同一个对象矩阵的两种不同用法这对我们至少对我是一种误导。 Indeed, speaking of a rank-2 tensor is not really accurate. The rank of a tensor has to be given by two numbers. The vector to vector mapping is given by a rank-(1,1) tensor, while the quadratic form is given by a rank-(0,2) tensor. There’s also the type (2,0) which also corresponds to a matrix, but which maps two covectors to a number, and which again transforms differently. 实际上说“秩为 2 的张量”并不十分准确。张量的秩必须由两个数来表示。将向量映射到向量的是 (1,1) 型张量而二次型由 (0,2) 型张量表示。还有 (2,0) 型张量它也对应一个矩阵但它将两个余向量映射到一个数并且其变换方式也不同。 The bottom line of this is: 归根结底 The components of a rank-2 tensor can be written in a matrix. 秩为 2 的张量的分量可以写成一个矩阵。 The tensor is not that matrix, because different types of tensors can correspond to the same matrix. 张量不是那个矩阵因为不同类型的张量可以对应同一个矩阵。 The differences between those tensor types are uncovered by the basis transformations (hence the physicist’s definition: “A tensor is what transforms like a tensor”). 这些张量类型之间的差异在基变换中会显现出来因此物理学家的定义是“张量是按照张量的方式变换的东西”。 Of course, another difference between matrices and tensors is that matrices are by definition two-index objects, while tensors can have any rank. 当然矩阵和张量之间的另一个区别是矩阵本质上是双指标对象而张量可以有任意的秩。 This is a great answer, because it reveals the question to be wrong. In fact, a matrix is not even a matrix, much less a tensor. 这是一个很棒的回答因为它揭示了这个问题本身是错误的。事实上矩阵甚至都不是矩阵更不用说是张量了。 I’m happy with the first sentence in your comment RyanReich but utterly confused by: “a matrix is not even a matrix”. Could you elaborate or point towards another source to explain this (unless I’ve taken it out of context?) Thanks. RyanReich我对你评论中的第一句话表示认同但完全被“矩阵甚至都不是矩阵”这句话搞糊涂了。你能详细解释一下吗或者指出另一个可以解释这一点的来源除非我断章取义了谢谢。 AJP It’s been a while, but I believe what I meant by that was that a matrix (array of numbers) is different from a matrix (linear transformation (1,1) tensor). The same array of numbers can represent several different basis-independent objects when a particular basis is chosen for them. AJP 过了一段时间了但我认为我的意思是数值数组形式的矩阵与作为线性变换的 (1,1) 型张量的矩阵是不同的。当为它们选择特定的基时同一个数值数组可以表示几个不同的与基无关的对象。 Proof basis change rule for quadratic form q[xyz][A][xyz]q [x\ y\ z][A]\begin{bmatrix}x\\y\\z\end{bmatrix}q[x y z][A]​xyz​​ 证明二次型 q[xyz][A][xyz]q [x\ y\ z][A]\begin{bmatrix}x\\y\\z\end{bmatrix}q[x y z][A]​xyz​​ 的基变换规则 with PPP being the change of basis matrix between x,y,zx,y,zx,y,z and u,v,wu,v,wu,v,w. 
其中 PPP 是 x,y,zx,y,zx,y,z 和 u,v,wu,v,wu,v,w 之间的基变换矩阵。 q[xyz][xyz][xyz]⊤[A][xyz](P[uvw])⊤[A]P[uvw][uvw]P⊤[A]P[uvw]\begin{align*} q[x\ y\ z]\begin{bmatrix}x\\y\\z\end{bmatrix}\\ \begin{bmatrix}x\\y\\z\end{bmatrix}^\top[A]\begin{bmatrix}x\\y\\z\end{bmatrix}\\ \left(P\begin{bmatrix}u\\v\\w\end{bmatrix}\right)^\top[A]P\begin{bmatrix}u\\v\\w\end{bmatrix}\\ [u\ v\ w]P^\top[A]P\begin{bmatrix}u\\v\\w\end{bmatrix} \end{align*}q​[x y z]​xyz​​​xyz​​⊤[A]​xyz​​​P​uvw​​​⊤[A]P​uvw​​[u v w]P⊤[A]P​uvw​​​ Indeed there are some “confusions” some people do when talking about tensors. This happens mainly on Physics where tensors are usually described as “objects with components which transform in the right way”. To really understand this matter, let’s first remember that those objects belong to the realm of linear algebra. Even though they are used a lot in many branches of mathematics the area of mathematics devoted to the systematic study of those objects is really linear algebra. 的确有些人在谈论张量时会存在一些“困惑”。这主要发生在物理学中在物理学里张量通常被描述为“其分量按照特定方式变换的对象”。要真正理解这个问题让我们首先记住这些对象属于线性代数的范畴。尽管它们在许多数学分支中被大量使用但专门系统研究这些对象的数学领域实际上是线性代数。 So let’s start with two vector spaces V,WV, WV,W over some field of scalars FFF. Now, let T:V→WT: V \to WT:V→W be a linear transformation. I’ll assume that you know that we can associate a matrix with TTT. Now, you might say: so linear transformations and matrices are all the same! And if you say that, you’ll be wrong. The point is: one can associate a matrix with TTT only when one fix some basis of VVV and some basis of WWW. In that case we will get TTT represented on those bases, but if we don’t introduce those, TTT will be TTT and matrices will be matrices (rectangular arrays of numbers, or whatever definition you like). 那么让我们从某个标量域 FFF 上的两个向量空间 V,WV, WV,W 开始。现在设 T:V→WT: V \to WT:V→W 是一个线性变换。我假设你知道我们可以将一个矩阵与 TTT 相关联。这时你可能会说所以线性变换和矩阵是一回事如果你这么说那你就错了。关键在于只有当我们固定 VVV 的某个基和 WWW 的某个基时才能将一个矩阵与 TTT 相关联。在这种情况下我们会得到 TTT 在这些基下的表示但如果不引入这些基TTT 就是 TTT而矩阵就是矩阵矩形的数值数组或任何你喜欢的定义。 Now, the construction of tensors is much more elaborate than just saying: “take a set of numbers, label by components, let they transform in the correct way, you get a tensor”. In truth, this “definition” is a consequence of the actual definition. Indeed the actual definition of a tensor is meant to introduce what we call “Universal Property”. 现在张量的构造远比“取一组数用分量标记让它们按照正确的方式变换你就得到了一个张量”这种说法要复杂得多。事实上这个“定义”是实际定义的一个推论。的确张量的实际定义是为了引入我们所说的“泛性质”。 The point is that if we have a collection of ppp vector spaces ViV_iVi​ and another vector space WWW we can form functions of several variables f:V1×⋯×Vp→Wf: V_1 \times \cdots \times V_p \to Wf:V1​×⋯×Vp​→W. A function like this will be called multilinear if it’s linear in each argument with the others held fixed. Now, since we know how to study linear transformations we ask ourselves: is there a construction of a vector space SSS and one universal multilinear map T:V1×⋯×Vp→ST: V_1 \times \cdots \times V_p \to ST:V1​×⋯×Vp​→S such that fg∘Tf g \circ Tfg∘T for some g:S→Wg: S \to Wg:S→W linear and such that this holds for all fff? If that’s always possible we’ll reduce the study of multilinear maps to the study of linear maps. 
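(In coordinates, the "reduce multilinear maps to linear maps" idea can be made concrete; the NumPy sketch below is my illustration rather than part of the answer. It shows a bilinear form f(u, v) = uᵀMv factoring through the rank-1 tensor u ⊗ v: f is just a linear functional applied to the outer product.)

import numpy as np

rng = np.random.default_rng(0)
M = rng.normal(size=(3, 4))              # defines a bilinear map R^3 x R^4 -> R
u = rng.normal(size=3)
v = rng.normal(size=4)

f_bilinear = u @ M @ v                   # f(u, v) computed directly
u_tensor_v = np.outer(u, v)              # the rank-1 tensor u ⊗ v, an element of R^3 ⊗ R^4
f_linear = np.sum(M * u_tensor_v)        # the same value, as a linear function of u ⊗ v

print(np.isclose(f_bilinear, f_linear))  # True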
关键在于如果我们有 ppp 个向量空间 ViV_iVi​ 的集合和另一个向量空间 WWW我们可以构造多变量函数 f:V1×⋯×Vp→Wf: V_1 \times \cdots \times V_p \to Wf:V1​×⋯×Vp​→W。如果这样的函数在每个自变量上都是线性的其他自变量固定那么它就被称为多线性函数。现在由于我们知道如何研究线性变换我们会问自己是否存在一个向量空间 SSS 和一个“泛”多线性映射 T:V1×⋯×Vp→ST: V_1 \times \cdots \times V_p \to ST:V1​×⋯×Vp​→S使得对于某个线性映射 g:S→Wg: S \to Wg:S→W有 fg∘Tf g \circ Tfg∘T并且这对所有 fff 都成立如果这总是可能的我们就可以将多线性映射的研究简化为线性映射的研究。 The happy part of the story is that this is always possible, the construction is well defined and SSS is denoted V1⊗⋯⊗VpV_1 \otimes \cdots \otimes V_pV1​⊗⋯⊗Vp​ and is called the tensor product of the vector spaces and the map TTT is the tensor product of the vectors. An element t∈St \in St∈S is called a tensor. Now it’s possible to prove that if ViV_iVi​ has dimension nin_ini​ then the following relation holds: 这个故事中令人愉快的部分是这总是可能的这个构造是定义良好的并且 SSS 记为 V1⊗⋯⊗VpV_1 \otimes \cdots \otimes V_pV1​⊗⋯⊗Vp​称为向量空间的张量积而映射 TTT 是向量的张量积。SSS 中的元素 t∈St \in St∈S 被称为张量。现在可以证明如果 ViV_iVi​ 的维数为 nin_ini​则有以下关系成立 dim⁡(V1⊗⋯⊗Vp)∏i1pni\dim(V_1 \otimes \cdots \otimes V_p) \prod_{i1}^p n_idim(V1​⊗⋯⊗Vp​)∏i1p​ni​ This means that SSS has a basis with ∏i1pni\prod_{i1}^p n_i∏i1p​ni​ elements. In that case, as we know from basic linear algebra, we can associate with every t∈St \in St∈S its components in some basis. Now, those components are what people usually call “the tensor”. Indeed, when you see in Physics people saying: “consider the tensor TαβT_{\alpha\beta}Tαβ​” what they are really saying is “consider the tensor TTT whose components in some basis understood by context are TαβT_{\alpha\beta}Tαβ​”. 这意味着 SSS 有一个包含 ∏i1pni\prod_{i1}^p n_i∏i1p​ni​ 个元素的基。在这种情况下正如我们从基本线性代数中所知我们可以将每个 t∈St \in St∈S 与其在某个基下的分量相关联。现在这些分量就是人们通常所说的“张量”。事实上当你在物理学中看到人们说“考虑张量 TαβT_{\alpha\beta}Tαβ​”时他们真正的意思是“考虑在上下文所理解的某个基下分量为 TαβT_{\alpha\beta}Tαβ​ 的张量 TTT”。 So if we consider two vector spaces V1V_1V1​ and V2V_2V2​ with dimensions respectivly nnn and mmm, by the result I’ve stated dim⁡(V1⊗V2)nm\dim(V_1 \otimes V_2) nmdim(V1​⊗V2​)nm, so for every tensor t∈V1⊗V2t \in V_1 \otimes V_2t∈V1​⊗V2​ one can associate a set of nmnmnm scalars (the components of ttt), and we are obviously allowed to plug those values into a matrix M(t)M(t)M(t) and so there’s a correspondence of tensors of rank 2 with matrices. 因此如果我们考虑两个向量空间 V1V_1V1​ 和 V2V_2V2​它们的维数分别为 nnn 和 mmm根据我所陈述的结果dim⁡(V1⊗V2)nm\dim(V_1 \otimes V_2) nmdim(V1​⊗V2​)nm所以对于每个张量 t∈V1⊗V2t \in V_1 \otimes V_2t∈V1​⊗V2​我们可以将其与一组 nmnmnm 个标量ttt 的分量相关联显然我们可以将这些值放入一个矩阵 M(t)M(t)M(t) 中因此秩为 2 的张量与矩阵之间存在对应关系。 However, exactly as in the linear transformation case this correspondence is only possible when we have selected bases on the vector spaces we are dealing with. Finally, with every tensor it is possible to associate also a multilinear map. So tensors can be understood in their fully abstract and algebraic way as elements of the tensor product of vector spaces, and can also be understood as multilinear maps (this is better for intuition) and we can associate matrices to those. 然而正如在线性变换的情况下一样这种对应关系只有在我们为所处理的向量空间选择了基之后才有可能。最后每个张量还可以与一个多线性映射相关联。因此张量可以以完全抽象和代数的方式理解为向量空间张量积的元素也可以理解为多线性映射这更符合直觉并且我们可以将矩阵与这些张量相关联。 So after all this hassle with linear algebra, the short answer to your question is: matrices are matrices, tensors of rank 2 are tensors of rank 2, however there’s a correspondence between them whenever you fix a basis on the space of tensors. 因此在经历了这么多线性代数的麻烦之后对你的问题的简短回答是矩阵就是矩阵秩为 2 的张量就是秩为 2 的张量然而当你在张量空间上固定一个基时它们之间存在对应关系。 My suggestion is that you read Kostrikin’s “Linear Algebra and Geometry” chapter 4 on multilinear algebra. 
This book is hard, but it’s good to really get the ideas. Also, you can see about tensors (constructions in terms of multilinear maps) in good books of multivariable Analysis like “Calculus on Manifolds” by Michael Spivak or “Analysis on Manifolds” by James Munkres. 我的建议是你阅读柯斯特利金的《线性代数与几何》中关于多线性代数的第 4 章。这本书很难但对于真正理解这些概念很有帮助。此外你可以在好的多变量分析书籍中了解张量从多线性映射的角度构建例如迈克尔·斯皮瓦克的《流形上的微积分》或詹姆斯·芒克雷斯的《流形分析》。 I must be missing something, but can’t you just set SW,gIdWS W, g \text{Id}_WSW,gIdW​? 我一定是漏掉了什么但你不能直接设 SW,gIdWS W, g \text{Id}_WSW,gIdW​ 吗 The point is that we want a space SSS constructed from the vector spaces ViV_iVi​ such that we can use it for all WWW. In other words, given just ViV_iVi​ we can build the pair (S,g)(S, g)(S,g) and use once and for all for any WWW and fff. That is why it is calles universal property. 关键在于我们想要一个由向量空间 ViV_iVi​ 构造的空间 SSS使得我们可以将它用于所有的 WWW。换句话说仅给定 ViV_iVi​我们就可以构建出对 (S,g)(S, g)(S,g)并一劳永逸地用于任何 WWW 和 fff。这就是为什么它被称为泛性质。 As a place-holder answer waiting perhaps for clarification by the questioner’s (and others’) reaction: given that your context has a matrix be a table of values (which can be entirely reasonable)… 作为一个临时答案或许等待提问者和其他人的反馈来澄清假设在你的语境中矩阵是一个数值表格这完全是合理的…… In that context, a “vector” is a list of values, a “matrix” is a table (or list of lists), the next item would be a list of tables (equivalently, a table of lists, or list of lists of lists), then a table of tables (equivalently, a list of tables of lists, or list of lists of tables…). And so on. All these are “tensors”. 在这种语境下“向量”是一个数值列表“矩阵”是一个表格或列表的列表接下来的是表格的列表等价地列表的表格或列表的列表的列表然后是表格的表格等价地列表的表格的列表或列表的列表的表格……依此类推。所有这些都是“张量”。 Unsurprisingly, there are many more sophisticated viewpoints that can be taken, but perhaps this bit of sloganeering is useful? 不出所料可以有许多更复杂的观点但或许这种简单的说法是有用的 In addition to the answer of celtschk, this makes tensors make some sense to me (and their different ranks) 除了 celtschk 的回答 之外这让我对张量以及它们的不同秩有了一些理解 So basically a tensor is an array of objects in programming. Tensor1 array. Tensor2 array of array. Tensor3 array of array of array. 所以基本上在编程中张量是对象的数组。1 阶张量 数组。2 阶张量 数组的数组。3 阶张量 数组的数组的数组。 Pacerier, yes, from a programming viewpoint that would be a reasonable starter-version of what a tensor is. But, as noted in my answer, in various mathematical contexts there is complication, due, in effect, to “collapsing” in the indexing scheme. Pacerier是的从编程的角度来看这是对张量的一个合理的初步理解。但是正如我在回答中提到的在各种数学语境中由于索引方案中的“压缩”情况会变得复杂。 paulgarrett Can we say objects with a rank of more than 2 are tensors? or a scalar is also a tensor? paulgarrett 我们可以说秩大于 2 的对象是张量吗或者标量也是张量 Yes, we can (if we insist) say that a scalar is a tensor of rank 0. And, yes, there are higher-rank tensors: sometimes that higher rank is visible in the number of subscripts and/or superscripts they carry. Such things arise in geometry (and, thereby, in general relativity). 是的我们可以如果我们坚持的话说标量是秩为 0 的张量。而且是的存在更高秩的张量有时这种更高的秩可以从它们所带的下标和/或上标的数量中看出。这类东西出现在几何学中从而也出现在广义相对论中。 Tensors are objects whose transformation laws make them geometrically meaningful. Yes, I am a physicist, and to me, that is what a tensor is: there is a general idea that tensors are objects merely described using components with respect to some basis, and as coordinates change (and thus the associated basis changes), the tensor’s components should transform accordingly. What those laws are follows, then, from the chain rule of multivariable calculus, nothing more. 
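(The transformation-law point — one and the same array of numbers becoming P⁻¹AP when read as a linear map but PᵀAP when read as a quadratic form, as in the earlier answer — can be checked numerically. This is an illustrative sketch; P is a random, almost surely invertible change-of-basis matrix of my choosing.)

import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 3))    # one and the same array of numbers ...
P = rng.normal(size=(3, 3))    # ... and an invertible change of basis
v = rng.normal(size=3)
w = rng.normal(size=3)

v_new = np.linalg.solve(P, v)  # coordinates of v in the new basis, i.e. P^{-1} v
w_new = np.linalg.solve(P, w)

# Read as a linear map (a (1,1) tensor), A must become P^{-1} A P:
A_map = np.linalg.inv(P) @ A @ P
print(np.allclose(A_map @ v_new, np.linalg.solve(P, A @ v)))  # True: same map, new coordinates

# Read as a bilinear/quadratic form (a (0,2) tensor), A must become P^T A P instead:
A_form = P.T @ A @ P
print(np.isclose(w_new @ A_form @ v_new, w @ A @ v))          # True: same scalar value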
张量是其变换规律使其具有几何意义的对象。是的我是一名物理学家对我来说这就是张量的定义普遍的观点是张量只是用相对于某个基的分量来描述的对象并且当坐标变化时从而相关的基也变化时张量的分量应该相应地变换。这些规律源于多变量微积分的链式法则仅此而已。 What is a matrix? A representation of a linear map, also with respect to some basis. Thus, some tensors can be represented with matrices. 什么是矩阵矩阵是线性映射的一种表示也是相对于某个基的。因此一些张量可以用矩阵来表示。 Why some? Well, contrary to what you may have heard, not all tensors are inherently linear maps. Yes, you can construct a linear map from any tensor, but that is not what the tensor is. From a vector, you can construct a linear map acting on a covector to produce a scalar; this is where the idea comes from, but it’s misleading. Consider a different kind of quantity, representing an oriented plane. We’ll call it a bivector: From a bivector, you can construct a linear map taking in a covector and returning a vector, or a linear map taking two covectors and returning a scalar. 为什么是“一些”嗯与你可能听到的相反并非所有张量本质上都是线性映射。是的你可以从任何张量构造一个线性映射但这并不是张量本身。从一个向量你可以构造一个作用于余向量以产生标量的线性映射这就是这个想法的来源但它具有误导性。考虑一种不同的量它表示一个有向平面。我们称之为双向量从双向量你可以构造一个接收余向量并返回向量的线性映射或者一个接收两个余向量并返回标量的线性映射。 That you can construct multiple maps from a bivector should indicate that the bivector is, in itself, neither of these maps but a more fundamental geometric object. Bivectors are represented using antisymmetric 2-index tesors, or antisymmetric matrices. In fact, you can form bivectors from two vectors. While you can make that fit with the mapping picture, it starts to feel incredibly arbitrary. 你可以从双向量构造多个映射这表明双向量本身既不是这些映射中的任何一个而是一个更基本的几何对象。双向量用反对称的 2 指标张量或反对称矩阵来表示。事实上你可以从两个向量形成双向量。虽然你可以让它符合映射的图景但这开始让人觉得非常随意。 Some tensors are inherently linear maps, however, and all such maps can be written in terms of some basis as a matrix. Even the Riemann tensor, which has (n2)(n2)(n^2)(n^2)(n2)(n2) by (n2)(n2)(n^2)(n^2)(n2)(n2) components, can be written this way, even though it’s usually considered a map of two vectors to two vectors, three vectors to one vector, four vectors to a scalar…I could go on. 然而有些张量本质上是线性映射并且所有这些映射都可以根据某个基写成矩阵形式。即使是黎曼张量它有 (n2)(n2)×(n2)(n2)(n^2)(n^2) \times (n^2)(n^2)(n2)(n2)×(n2)(n2) 个分量也可以这样写尽管它通常被认为是将两个向量映射到两个向量、三个向量映射到一个向量、四个向量映射到一个标量……我可以继续列举下去。 But not all matrices represent information that is suitable for such geometric considerations. 但并非所有矩阵都代表适合这种几何考量的信息。 Your interpretation of bivectors as rank 2 antisymmetric tensors is very interesting. Where does it find its justification ? Could one construct something similar or analogous for “cobivectors”, outer products of 2 covectors, or 2-forms ? This could help provide interesting visualizations of 2-forms. 你将双向量解释为秩 2 的反对称张量这非常有趣。这种解释的依据是什么人们可以为“余双向量”、两个余向量的外积或 2 - 形式构造类似的东西吗这可能有助于提供 2 - 形式的有趣可视化。 The shortest answer I can come up with is that a Tensor is described by a matrix (or rank 1 vector) but also the type of thing represented. Matrices have no such “type” associated with them. If you misapply linear algebra on inconsistently typed matrices the math yields mathematically valid garbage. 我能想到的最简短的答案是张量由矩阵或秩 1 向量描述但还包含所表示事物的类型。矩阵没有与之相关联的这种“类型”。如果你将线性代数错误地应用于类型不一致的矩阵得到的结果在数学上是有效的但毫无意义。 Intuitively you can’t transform apples into peach pie. But you can transform apples into apple pie. Matrices have no intrinsic type associated with them so a linear algebra recipe to do the peach pie transform will produce garbage from the apples matrix. 
直观地说你不能把苹果变成桃派。但你可以把苹果变成苹果派。矩阵没有内在的相关类型所以一个用于制作桃派的线性代数方法用在苹果矩阵上会产生毫无意义的结果。 A more mathematical example is that if you have a vector describing text terms in a document and a vector describing DNA codes, you cannot take the cosine of the normalized vectors (dot product) to see how “similar” they are. The dot product is mathematically valid but since they are from different types and represent different things the dot product is meaningless garbage. But if you do the same with 2 text term vectors you can make statements about how similar they are from the result, it is indeed not garbage. 一个更具数学性的例子是如果你有一个描述文档中文本术语的向量和一个描述 DNA 编码的向量你不能通过计算归一化向量的余弦点积来判断它们有多“相似”。点积在数学上是有效的但由于它们来自不同的类型并代表不同的事物这个点积是毫无意义的。但如果你对两个文本术语向量做同样的事情你可以从结果中判断它们的相似程度这确实是有意义的。 What I’m calling “type” is more rigorously defined but the above gets the gist I think. 我所说的“类型”有更严格的定义但我认为上面的内容抓住了要点。 All matrices are not tensors, although all tensors of rank 2 are matrices. 并非所有矩阵都是张量尽管所有秩为 2 的张量都是矩阵。 Example 示例 T[x−yx2−y2]T \begin{bmatrix}x -y \\ x^2 -y^2\end{bmatrix}T[xx2​−y−y2​] This matrix TTT is not tensor rank 2. We test matrix TTT to rotation matrix 这个矩阵 TTT 不是秩为 2 的张量。我们用旋转矩阵来测试矩阵 TTT A[cos⁡(θ)sin⁡(θ)−sin⁡(θ)cos⁡(θ)]A \begin{bmatrix}\cos(\theta) \sin(\theta) \\ -\sin(\theta) \cos(\theta)\end{bmatrix}A[cos(θ)−sin(θ)​sin(θ)cos(θ)​] Now, expand tensor equation rank 2, for example 现在展开秩为 2 的张量方程例如 T11′Σ(A1i∗A1j∗Tij)(1)T_{11} \Sigma (A_{1i} * A_{1j} * T_{ij}) \quad (1)T11′​Σ(A1i​∗A1j​∗Tij​)(1) Now, calculate 现在计算 T11′x′x∗cos⁡(θ)y∗sin⁡(θ)(2)T_{11} x x * \cos(\theta) y * \sin(\theta) \quad (2)T11′​x′x∗cos(θ)y∗sin(θ)(2) You see (1) is unequal to (2), then we can conclude that the matrix TTT isn’t a tensor of rank 2. 你可以看到1与2不相等因此我们可以得出结论矩阵 TTT 不是秩为 2 的张量。 Tensor must follow the conversion(transformation) rules, but matrices generally are not. 张量必须遵循变换规则但矩阵通常不遵循。 Should you not sum over all indices in equation 2? So you would get xcos⁡2−y(cos⁡sin⁡)x2(cos⁡sin⁡)−y2sin⁡2x \cos^2 - y (\cos \sin) x^2 (\cos \sin) - y^2 \sin^2xcos2−y(cossin)x2(cossin)−y2sin2 (if I interpret your formula correctly, you sum over all iii and jjj, right?) 你不应该在方程 2 中对所有指标求和吗那么你会得到 xcos⁡2−y(cos⁡sin⁡)x2(cos⁡sin⁡)−y2sin⁡2x \cos^2 - y (\cos \sin) x^2 (\cos \sin) - y^2 \sin^2xcos2−y(cossin)x2(cossin)−y2sin2如果我对你的公式理解正确的话你要对所有的 iii 和 jjj 求和对吗 Also, would that not just show that it is not a rank (2,0) tensor, but not that it is not a rank 2 tensor in general (like 1,1 or 0,2)? 此外这难道不只是表明它不是2,0型张量而不是表明它一般不是秩为 2 的张量比如1,1型或0,2型吗 Why the sudden fascination with tensors? 为何人们突然对张量产生浓厚兴趣 edited Jul 4, 2016 at 5:30 I’ve noticed lately that a lot of people are developing tensor equivalents of many methods (tensor factorization, tensor kernels, tensors for topic modeling, etc) I’m wondering, why is the world suddenly fascinated with tensors? Are there recent papers/ standard results that are particularly surprising, that brought about this? Is it computationally a lot cheaper than previously suspected? 我最近注意到很多人正在开发许多方法的张量等价形式张量分解、张量核、用于主题建模的张量等。我在想为什么世界突然对张量如此着迷是否有最近的论文或标准结果特别令人惊讶从而引发了这种现象它在计算上是否比之前预想的便宜得多 I’m not being glib, I sincerely am interested, and if there are any pointers to papers about this, I’d love to read them. 我并非随口一问我是真心感兴趣如果有相关论文的参考资料我很乐意阅读。 It seems like the only retaining feature that “big data tensors” share with the usual mathematical definition is that they are multidimensional arrays. 
So I’d say that big data tensors are a marketable way of saying “multidimensional array,” because I highly doubt that machine learning people will care about either the symmetries or transformation laws that the usual tensors of mathematics and physics enjoy, especially their usefulness in forming coordinate free equations. “大数据张量”与通常的数学定义仅有的共同特征似乎是它们都是多维数组。因此我认为“大数据张量”是“多维数组”的一种市场化说法因为我非常怀疑机器学习领域的人会关心数学和物理学中常见张量所具有的对称性或变换规律尤其是它们在形成无坐标方程方面的作用。 —— Alex R. Commented Feb 23, 2016 at 19:00 AlexR. without invariance to transformations there are no tensors 亚历克斯·R. 没有变换不变性就不存在张量。 —— Aksakal Commented Feb 23, 2016 at 21:43 Putting on my mathematical hat I can say that there is no intrinsic symmetry to a mathematical tensor. Further, they are another way to say ‘multidimensional array’. One could vote for using the word tensor over using the phrase multidimensional array simply on grounds of simplicity. In particular if VVV is a nnn-dimensional vector space, one can identify V⊗VV \otimes VV⊗V with nnn by nnn matrices. 从数学的角度来说数学张量没有内在的对称性。此外它们是“多维数组”的另一种说法。仅仅出于简洁性人们可能会选择使用“张量”一词而非“多维数组”这一短语。特别是如果 VVV 是一个 nnn 维向量空间我们可以将 V⊗VV \otimes VV⊗V 等同于 n×nn \times nn×n 矩阵。 —— meh Commented Feb 24, 2016 at 14:47 aginensky If a tensor were nothing more than a multidimensional array, then why do the definitions of tensors found in math textbooks sound so complicated? From Wikipedia: “The numbers in the multidimensional array are known as the scalar components of the tensor… Just as the components of a vector change when we change the basis of the vector space, the components of a tensor also change under such a transformation. Each tensor comes equipped with a transformation law that details how the components of the tensor respond to a change of basis.” In math, a tensor is not just an array. aginensky 如果张量只不过是多维数组那么为什么数学教科书中的张量定义听起来如此复杂来自维基百科“多维数组中的数被称为张量的标量分量……正如当我们改变向量空间的基时向量的分量会发生变化一样张量的分量在这种变换下也会发生变化。每个张量都配备有一个变换规律详细说明张量的分量如何响应基的变化。” 在数学中张量不仅仅是一个数组。 —— littleO Commented Feb 25, 2016 at 0:55 Just some general thoughts on this discussion: I think that, as with vectors and matrices, the actual application often becomes a much-simplified instantiation of much richer theory. I am reading this paper in more depth: Tensor Decompositions and Applications and one thing that is really impressing me is that the “representational” tools for matrices (eigenvalue and singular value decompositions) have interesting generalizations in higher orders. I’m sure there are many more beautiful properties as well, beyond just a nice container for more indices. Tensor Decompositions and Applications Authors: Tamara G. Kolda and Brett W. BaderAuthors Info Affiliations https://doi.org/10.1137/07070111X 关于这场讨论的一些普遍想法我认为与向量和矩阵一样实际应用往往是更丰富理论的一种高度简化的实例。我正在更深入地阅读这篇论文张量分解和应用让我印象深刻的一点是矩阵的“表示性”工具特征值分解和奇异值分解在更高阶数上有有趣的推广。我确信除了作为容纳更多索引的良好容器之外它们还有许多更优美的性质。 —— Y. S. Commented Feb 25, 2016 at 15:40 Answers This is not an answer to your question, but an extended comment on the issue that has been raised here in comments by different people, namely: are machine learning “tensors” the same thing as tensors in mathematics? 这并非对你问题的回答而是对不同人在评论中提出的问题的延伸评论即机器学习中的“张量”与数学中的张量是同一事物吗 Now, according to the Cichoki 2014, Era of Big Data Processing: A New Approach via Tensor Networks and Tensor Decompositions, and Cichoki et al. 
2014, Tensor Decompositions for Signal Processing Applications, 根据 Cichoki 2014 年的论文《大数据处理时代通过张量网络和张量分解的新方法》arXiv:1403.2048以及Cichoki等人2014年的论文《用于信号处理应用的张量分解》arXiv:1403.4462 A higher-order tensor can be interpreted as a multiway array, […] 高阶张量可以解释为多路数组[…]   A tensor can be thought of as a multi-index numerical array, […] 张量可以被视为多索引数值数组[…]   Tensors (i.e., multi-way arrays) […] 张量即多路数组[…]   So in machine learning / data processing a tensor appears to be simply defined as a multidimensional numerical array. An example of such a 3D tensor would be 1000 video frames of 640×480 size. A usual n×pn \times pn×p data matrix is an example of a 2D tensor according to this definition. 因此在机器学习/数据处理中“张量”似乎被简单定义为多维数值数组。这种三维张量的一个例子是1000个640×480大小的视频帧。根据这个定义通常的 n×pn \times pn×p 数据矩阵是二维张量的一个例子。 This is not how tensors are defined in mathematics and physics! 这并非数学和物理学中张量的定义方式 A tensor can be defined as a multidimensional array obeying certain transformation laws under the change of coordinates (see Wikipedia or the first sentence in MathWorld article). A better but equivalent definition (see Wikipedia) says that a tensor on vector space VVV is an element of V⊗…⊗V∗V \otimes \ldots \otimes V^*V⊗…⊗V∗. Note that this means that, when represented as multidimensional arrays, tensors are of size p×pp \times pp×p or p×p×pp \times p \times pp×p×p etc., where ppp is the dimensionality of VVV. 张量可以定义为在坐标变换下遵循特定变换规律的多维数组参见维基百科或 MathWorld文章的第一句。一个更好但等价的定义参见维基百科指出向量空间 VVV 上的张量是 V⊗…⊗V∗V \otimes \ldots \otimes V^*V⊗…⊗V∗ 中的一个元素。请注意这意味着当表示为多维数组时张量的大小为 p×pp \times pp×p 或 p×p×pp \times p \times pp×p×p 等其中 ppp 是 VVV 的维数。 All tensors well-known in physics are like that: inertia tensor in mechanics is 3×33 \times 33×3, electromagnetic tensor in special relativity is 4×44 \times 44×4, Riemann curvature tensor in general relativity is 4×4×4×44 \times 4 \times 4 \times 44×4×4×4. Curvature and electromagnetic tensors are actually tensor fields, which are sections of tensor bundles (see [e.g. here](https://books.google.pt/books?id2ydvda4F1VEClpgPA163ots-ZYnjDcEpEdqtensor bundlepgPA163#vonepageqtensor bundleffalse) but it gets technical), but all of that is defined over a vector space VVV. 物理学中所有著名的张量都是如此力学中的惯性张量是 3×33 \times 33×3 的狭义相对论中的电磁张量是 4×44 \times 44×4 的广义相对论中的黎曼曲率张量是 4×4×4×44 \times 4 \times 4 \times 44×4×4×4 的。曲率张量和电磁张量实际上是张量场它们是张量丛的截面例如参见[这里](https://books.google.pt/books?id2ydvda4F1VEClpgPA163ots-ZYnjDcEpEdqtensor bundlepgPA163#vonepageqtensor bundleffalse)但内容较为专业但所有这些都是在向量空间 VVV 上定义的。 Of course one can construct a tensor product V⊗WV \otimes WV⊗W of an ppp-dimensional VVV and qqq-dimensional WWW but its elements are usually not called “tensors”, as stated e.g. here on Wikipedia: 当然我们可以构造 ppp 维空间 VVV 和 qqq 维空间 WWW 的张量积 V⊗WV \otimes WV⊗W但正如维基百科中所述其元素通常不被称为“张量” In principle, one could define a “tensor” simply to be an element of any tensor product. However, the mathematics literature usually reserves the term tensor for an element of a tensor product of a single vector space VVV and its dual, as above.   原则上人们可以简单地将“张量”定义为任何张量积中的元素。然而数学文献通常将“张量”一词保留用于单个向量空间 VVV 与其对偶空间的张量积中的元素如上所述。 One example of a real tensor in statistics would be a covariance matrix. It is p×pp \times pp×p and transforms in a particular way when the coordinate system in the ppp-dimensional feature space VVV is changed. It is a tensor. But a n×pn \times pn×p data matrix XXX is not. 
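(A quick numerical check of this transformation behaviour; the data and the feature-space transformation A below are synthetic and of my choosing.)

import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 3))            # n x p data matrix: 100 subjects, 3 features
A = rng.normal(size=(3, 3))              # a linear change of coordinates in feature space

C = np.cov(X, rowvar=False)              # p x p covariance matrix in the old coordinates
C_new = np.cov(X @ A, rowvar=False)      # covariance after transforming the features

# The covariance matrix obeys the (0,2)-tensor rule C -> A^T C A:
print(np.allclose(C_new, A.T @ C @ A))   # True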
统计学中一个真正的张量例子是协方差矩阵。它是 p×pp \times pp×p 的当 ppp 维特征空间 VVV 中的坐标系发生变化时它会以特定方式变换。它是一个张量。但 n×pn \times pn×p 的数据矩阵 XXX 不是。 But can we at least think of XXX as an element of tensor product W⊗VW \otimes VW⊗V, where WWW is nnn-dimensional and VVV is ppp-dimensional? For concreteness, let rows in XXX correspond to people (subjects) and columns to some measurements (features). A change of coordinates in VVV corresponds to linear transformation of features, and this is done in statistics all the time (think of PCA). But a change of coordinates in WWW does not seem to correspond to anything meaningful (and I urge anybody who has a counter-example to let me know in the comments). So it does not seem that there is anything gained by considering XXX as an element of W⊗VW \otimes VW⊗V. 但我们至少能将 XXX 视为张量积 W⊗VW \otimes VW⊗V 中的一个元素吗其中 WWW 是 nnn 维的VVV 是 ppp 维的。具体来说设 XXX 中的行对应于人员对象列对应于某些测量值特征。VVV 中的坐标变换对应于特征的线性变换这在统计学中很常见例如主成分分析。但 WWW 中的坐标变换似乎不对应任何有意义的东西我恳请任何有反例的人在评论中告诉我。因此将 XXX 视为 W⊗VW \otimes VW⊗V 中的元素似乎没有任何意义。 And indeed, the common notation is to write X∈Rn×pX \in \mathbb{R}^{n \times p}X∈Rn×p, where Rn×p\mathbb{R}^{n \times p}Rn×p is a set of all n×pn \times pn×p matrices (which, by the way, are defined as rectangular arrays of numbers, without any assumed transformation properties). 事实上常见的记法是 X∈Rn×pX \in \mathbb{R}^{n \times p}X∈Rn×p其中 Rn×p\mathbb{R}^{n \times p}Rn×p 是所有 n×pn \times pn×p 矩阵的集合顺便说一下矩阵被定义为 数字的矩形数组没有任何假定的变换性质。 My conclusion is: (a) machine learning tensors are not math/physics tensors, and (b) it is mostly not useful to see them as elements of tensor products either. 我的结论是a机器学习中的张量不是数学/物理学中的张量b将它们视为张量积的元素通常也没有意义。 Instead, they are multidimensional generalizations of matrices. Unfortunately, there is no established mathematical term for that, so it seems that this new meaning of “tensor” is now here to stay. 相反它们是矩阵的多维推广。不幸的是对此没有既定的数学术语因此“张量”的这种新含义似乎将保留下来。 edited Sep 3, 2016 at 23:55 amoeba I am a pure mathematician, and this is a very good answer. In particular, the example of a covariance matrix is an excellent way to understand the “transformation properties” or “symmetries” that seemed to cause confusion above. If you change coordinates on your ppp-dimensional feature space, the covariance matrix transforms in a particular and possibly surprising way; if you did the more naive transformation on your covariances you would end up with incorrect results. 我是一名纯数学家这是一个非常好的回答。特别是协方差矩阵的例子是理解上述似乎引起混淆的“变换性质”或“对称性”的绝佳方式。如果你在 ppp 维特征空间中改变坐标协方差矩阵会以一种特定且可能令人惊讶的方式变换如果你对协方差进行更简单的变换最终会得到错误的结果。 —— Tom Church Commented Feb 25, 2016 at 4:08 Thanks, Tom, I appreciate that you registered on CrossValidated to leave this comment. It has been a long time since I was studying differential geometry so I am happy if somebody confirms what I wrote. It is a pity that there is no established term in mathematics for “multidimensional matrices”; it seems that “tensor” is going to stick in machine learning community as a term for that. How do you think one should rather call it though? The best thing that comes to my mind is nnn-matrices (e.g. 3-matrix to refer to a video object), somewhat analogously to nnn-categories. 谢谢汤姆感谢你注册CrossValidated来留下这条评论。我学习微分几何已经很久了所以如果有人能证实我所写的内容我会很高兴。遗憾的是数学中没有“多维矩阵”的既定术语似乎“张量”一词将在机器学习社区中作为其术语保留下来。不过你认为应该称它为什么呢我能想到的最好的是 nnn-矩阵例如用3-矩阵指代视频对象这在某种程度上类似于 nnn-范畴。 —— amoeba Commented Feb 25, 2016 at 10:25 amoeba, in programming the multidemensional matrices are usually called arrays, but some languages such as MATLAB would call them matrices. 
For instance, in FORTRAN the arrays can have more than 2 dimensions. In languages like C/C/Java the arrays are one dimensional, but you can have arrays of arrays, making them work like multidimensional arrays too. MATLAB supports 3 or more dimensional arrays in the syntax. 阿米巴在编程中多维矩阵通常被称为“数组”arrays但有些语言如MATLAB会称它们为“矩阵”matrices。例如在FORTRAN中数组可以有超过2个维度。在C/C/Java等语言中数组是一维的但你可以有数组的数组使它们也能像多维数组一样工作。MATLAB在语法上支持3维或更高维的数组。 —— Aksakal Commented Feb 25, 2016 at 15:17 That is very interesting. I hope you will emphasize that point. But please take some care not to confuse a set with a vector space it determines, because the distinction is important in statistics. In particular (to pick up one of your examples), although a linear combination of people is meaningless, a linear combination of real-valued functions on a set of people is both meaningful and important. It’s the key to solving linear regression, for instance. 这非常有趣。我希望你能强调这一点。但请注意不要将一个集合与其所确定的向量空间混淆因为这种区别在统计学中很重要。特别是以你的一个例子为例虽然人的线性组合是没有意义的但一组人上的实值函数的线性组合既有意义又很重要。例如这是解决线性回归的关键。 —— whuber♦ Commented Jul 5, 2016 at 13:56 Per T. Kolda, B, Bada,“Tensor Decompositions and Applications” SIAM Review 2009, epubs.siam.org/doi/pdf/10.1137/07070111X A tensor is a multidimensional array. More formally, an N-way or Nth-order tensor is an element of the tensor product of N vector spaces, each of which has its own coordinate system. This notion of tensors is not to be confused with tensors in physics and engineering (such as stress tensors), which are generally referred to as tensor fields in mathematics 根据T. Kolda、B. Bada在《张量分解及其应用》《SIAM评论》2009年epubs.siam.org/doi/pdf/10.1137/07070111X中的说法“张量是一个多维数组。更正式地说N阶张量是N个向量空间的张量积中的一个元素每个向量空间都有自己的坐标系。这种张量的概念不应与物理学和工程学中的张量如应力张量相混淆后者在数学中通常被称为张量场。” —— Mark L. Stone Commented Sep 3, 2016 at 20:31 Tensors often offer more natural representations of data, e.g., consider video, which consists of obviously correlated images over time. You can turn this into a matrix, but it’s just not natural or intuitive (what does a factorization of some matrix-representation of video mean?). 张量通常能更自然地表示数据例如考虑视频它由随时间变化的、明显相关的图像组成。你可以将其转换为矩阵但这并不自然或直观视频的某种矩阵表示的分解意味着什么。 Tensors are trending for several reasons: 张量之所以流行有几个原因 our understanding of multilinear algebra is improving rapidly, specifically in various types of factorizations, which in turn helps us to identify new potential applications (e.g., multiway component analysis) 我们对多线性代数的理解正在迅速提高特别是在各种类型的分解方面这反过来帮助我们发现新的潜在应用例如多路成分分析 software tools are emerging (e.g., Tensorlab) and are being welcomed 软件工具正在涌现例如Tensorlab并受到欢迎 Big Data applications can often be solved using tensors, for example recommender systems, and Big Data itself is hot 大数据应用通常可以使用张量来解决例如推荐系统而大数据本身很热门 increases in computational power, as some tensor operations can be hefty (this is also one of the major reasons why deep learning is so popular now) 计算能力的提升因为一些张量运算可能很庞大这也是深度学习现在如此流行的主要原因之一 edited Feb 23, 2016 at 9:55 Marc Claesen On the computational power part: I think the most important is that linear algebra can be very fast on GPUs, and lately they have gotten bigger and faster memories, that is the biggest limitation when processing large data. 关于计算能力部分我认为最重要的是线性代数运算在GPU上可以非常快而且最近GPU的内存变得更大、速度更快而内存是处理大数据时最大的限制。 —— Davidmh Commented Feb 23, 2016 at 12:17 Marc Claesen’s answer is a good one. David Dunson, Distinguished Professor of Statistics at Duke, has been one of the key exponents of tensor-based approaches to modeling as in this presentation, Bayesian Tensor Regression. 
icerm.brown.edu/materials/Slides/sp-f12-w1/… 马克·克拉森的回答很好。杜克大学统计学杰出教授戴维·邓森一直是基于张量的建模方法的主要倡导者如本次演讲《贝叶斯张量回归》中所述。icerm.brown.edu/materials/Slides/sp-f12-w1/… —— user78229 Commented Feb 23, 2016 at 14:36 As mentioned by David, Tensor algorithms often lend themselves well to parallelism, which hardware (such as GPU accelerators) are increasingly getting better at. 正如戴维所提到的张量算法通常非常适合并行处理而硬件如GPU加速器在这方面的能力正日益增强。 —— Thomas Russell Commented Feb 23, 2016 at 15:15 I assumed that the better memory/CPU capabilities were playing a part, but the very recent burst of attention was interesting; I think it must be because of a lot of recent surprising successes with recommender systems, and perhaps also kernels for SVMs, etc. Thanks for the links! great places to start learning about this stuff… 我认为更好的内存/CPU能力起到了一定作用但最近突然受到关注很有趣我想这一定是因为最近在推荐系统方面取得了很多令人惊讶的成功或许还有支持向量机的核函数等。感谢这些链接是学习这些东西的好起点…… —— Y. S. Commented Feb 24, 2016 at 7:20 If you store a video as a multidimensional array, I don’t see how this multidimensional array would have any of the invariance properties a tensor is supposed to have. It doesn’t seem like the word “tensor” is appropriate in this example. 如果你将视频存储为多维数组我看不出这个多维数组会具有张量应有的任何不变性。在这个例子中“张量”一词似乎并不合适。 —— littleO Commented Feb 24, 2016 at 8:45 I think your question should be matched with an answer that is equally free flowing and open minded as the question itself. So, here they are my two analogies. 我认为你的问题应该得到一个与问题本身一样自由流畅、思想开放的回答。因此我有两个类比。 First, unless you’re a pure mathematician, you were probably taught univariate probabilities and statistics first. For instance, most likely your first OLS example was probably on a model like this: 首先除非你是纯数学家否则你可能首先学习的是单变量概率和统计。例如你接触的第一个普通最小二乘法…OLS例子很可能是这样的模型 yiabxieiy_i a b x_i e_iyi​abxi​ei​ Most likely, you went through deriving the estimates through actually minimizing the sum of least squares: 很可能你通过实际最小化残差平方和来推导估计值 TSS∑i(yi−aˉ−bˉxi)2TSS \sum_i (y_i - \bar{a} - \bar{b} x_i)^2TSSi∑​(yi​−aˉ−bˉxi​)2 Then you write the FOCs for parameters and get the solution: 然后你写出参数的一阶条件…FOC并得到解 ∂TTS∂aˉ0\frac{\partial TTS}{\partial \bar{a}} 0∂aˉ∂TTS​0 Then later you’re told that there’s an easier way of doing this with vector (matrix) notation: 之后你会被告知用向量矩阵符号有更简单的方法 yXbey X b eyXbe and the TTS becomes: 此时TTS变为 TTS(y−Xbˉ)′(y−Xbˉ)TTS (y - X \bar{b}) (y - X \bar{b})TTS(y−Xbˉ)′(y−Xbˉ) The FOCs are: 一阶条件为 2X′(y−Xbˉ)02 X (y - X \bar{b}) 02X′(y−Xbˉ)0 And the solution is 解为 bˉ(X′X)−1X′y\bar{b} (X X)^{-1} X ybˉ(X′X)−1X′y If you’re good at linear algebra, you’ll stick to the second approach once you’ve learned it, because it’s actually easier than writing down all the sums in the first approach, especially once you get into multivariate statistics. 如果你擅长线性代数一旦学会第二种方法你就会坚持使用它因为它实际上比第一种方法中写下所有求和式更容易尤其是当你进入多元统计领域时。 Hence my analogy is that moving to tensors from matrices is similar to moving from vectors to matrices: if you know tensors some things will look easier this way. 因此我的类比是从矩阵过渡到张量类似于从向量过渡到矩阵如果你了解张量有些事情用这种方式会显得更简单。 Second, where do the tensors come from? I’m not sure about the whole history of this thing, but I learned them in theoretical mechanics. Certainly, we had a course on tensors, but I didn’t understand what was the deal with all these fancy ways to swap indices in that math course. It all started to make sense in the context of studying tension forces. 
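(As an aside, the matrix form of the OLS estimator quoted earlier in this answer, b̂ = (X′X)⁻¹X′y, is easy to try directly in NumPy; the data below are synthetic and only illustrative.)

import numpy as np

rng = np.random.default_rng(3)
n = 200
x = rng.normal(size=n)
y = 1.5 + 2.0 * x + rng.normal(scale=0.1, size=n)  # true a = 1.5, b = 2.0, plus noise

X = np.column_stack([np.ones(n), x])               # design matrix with an intercept column
b_hat = np.linalg.inv(X.T @ X) @ X.T @ y           # b_hat = (X'X)^{-1} X'y, as in the answer
print(b_hat)                                       # approximately [1.5, 2.0]

# In practice one would use a solver rather than an explicit inverse:
print(np.linalg.lstsq(X, y, rcond=None)[0])        # same estimates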
Second, where do the tensors come from? I'm not sure about the whole history of this thing, but I learned them in theoretical mechanics. Certainly, we had a course on tensors, but I didn't understand what was the deal with all these fancy ways to swap indices in that math course. It all started to make sense in the context of studying tension forces.
其次张量来自哪里我不确定它的整个历史但我是在理论力学中学习张量的。当然我们有一门张量课程但在那门数学课上我不明白所有这些花哨的索引交换方式有什么意义。在研究张力的背景下这一切才开始变得有意义。

So, in physics they also start with a simple example of pressure, defined as force per unit area, hence:
因此在物理学中他们也从一个简单的例子开始即压力定义为单位面积上的力因此

$$F = p \cdot dS$$

This means you can calculate the force vector $F$ by multiplying the pressure $p$ (a scalar) by the unit of area $dS$ (a normal vector). That is when we have only one infinite plane surface. In this case there's just one perpendicular force. A large balloon would be a good example.
这意味着你可以通过将压力 $p$（标量）乘以面积单位 $dS$（法向量）来计算力向量 $F$。这是当我们只有一个无限大平面时的情况。在这种情况下只有一个垂直力。一个大气球就是一个很好的例子。

However, if you're studying tension inside materials, you are dealing with all possible directions and surfaces. In this case you have forces on any given surface pulling or pushing in all directions, not only perpendicular ones. Some surfaces are torn apart by tangential forces "sideways", etc. So, your equation becomes:
然而如果你在研究材料内部的张力你会涉及所有可能的方向和表面。在这种情况下任何给定表面上的力都会在所有方向上拉或推而不仅仅是垂直方向。有些表面会被“侧向”的切向力撕裂等。因此你的方程变为

$$F = P \cdot dS$$

The force is still a vector $F$ and the surface area is still represented by its normal vector $dS$, but $P$ is a tensor now, not a scalar.
力仍然是向量 $F$表面积仍然由其法向量 $dS$ 表示但 $P$ 现在是一个张量而不是标量。

Ok, a scalar and a vector are also tensors
好吧标量和向量也是张量

Another place where tensors show up naturally is covariance or correlation matrices. Just think of this: how do we transform one correlation matrix $C_0$ into another one $C_1$? You realize we can't just do it this way:
张量自然出现的另一个地方是协方差矩阵或相关矩阵。想想看如何将一个相关矩阵 $C_0$ 变换为另一个相关矩阵 $C_1$你会意识到我们不能这样做

$$C_\theta(i,j) = C_0(i,j) + \theta \,\big(C_1(i,j) - C_0(i,j)\big),$$

where $\theta \in [0,1]$, because we need to keep all $C_\theta$ positive semi-definite.
其中 $\theta \in [0,1]$因为我们需要保持所有 $C_\theta$ 都是半正定的。

So, we'd have to find the path $\delta C_\theta$ such that $C_1 = C_0 + \int_\theta \delta C_\theta$, where $\delta C_\theta$ is a small disturbance to a matrix. There are many different paths, and we could search for the shortest ones. That's how we get into Riemannian geometry, manifolds, and… tensors.
因此我们必须找到路径 $\delta C_\theta$使得 $C_1 = C_0 + \int_\theta \delta C_\theta$其中 $\delta C_\theta$ 是对矩阵的一个小扰动。有许多不同的路径我们可以寻找最短的路径。这就是我们进入黎曼几何、流形以及……张量的原因。

UPDATE: what's a tensor, anyway?
更新到底什么是张量

amoeba and others got into a lively discussion of the meaning of tensor and whether it's the same as an array. So, I thought an example is in order.
阿米巴和其他人就张量的含义以及它是否与数组相同展开了热烈的讨论。因此我认为有必要举一个例子。

Say, we go to a bazaar to buy groceries, and there are two merchant dudes, $d_1$ and $d_2$. We noticed that if we pay $x_1$ dollars to $d_1$ and $x_2$ dollars to $d_2$, then $d_1$ sells us $y_1 = 2x_1 - x_2$ pounds of apples, and $d_2$ sells us $y_2 = -0.5x_1 + 2x_2$ pounds of oranges. For instance, if we pay both 1 dollar, i.e. $x_1 = x_2 = 1$, then we must get 1 pound of apples and 1.5 pounds of oranges.
假设我们去集市买杂货有两个商人$d_1$ 和 $d_2$。我们注意到如果我们付给 $d_1$ $x_1$ 美元付给 $d_2$ $x_2$ 美元那么 $d_1$ 会卖给我们 $y_1 = 2x_1 - x_2$ 磅苹果$d_2$ 会卖给我们 $y_2 = -0.5x_1 + 2x_2$ 磅橙子。例如如果我们各付1美元即 $x_1 = x_2 = 1$那么我们会得到1磅苹果和1.5磅橙子。
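A quick sanity check of the pricing rules just stated, as a minimal Python sketch (the two rules are collected into the matrix $P$ in the next paragraph; the function names below are made up for illustration):

```python
# Pricing rules stated above // 上文给出的定价规则
def apples(x1, x2):
    # pounds of apples sold by d1 // d1 卖出的苹果磅数
    return 2 * x1 - 1 * x2

def oranges(x1, x2):
    # pounds of oranges sold by d2 // d2 卖出的橙子磅数
    return -0.5 * x1 + 2 * x2

# Pay each merchant 1 dollar // 各付 1 美元
print(apples(1, 1), oranges(1, 1))  # 1 1.5, matching the claim above // 与上文结论一致
```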
We can express this relation in the form of a matrix $P$:
我们可以用矩阵 $P$ 的形式表示这种关系

$$P = \begin{bmatrix} 2 & -1 \\ -0.5 & 2 \end{bmatrix}$$

Then the merchants produce this many pounds of apples and oranges if we pay them $x$ dollars:
如果我们付给商人 $x$ 美元他们会提供这么多苹果和橙子

$$y = Px$$

This works exactly like a matrix-by-vector multiplication.
这完全像矩阵与向量的乘法。

Now, let's say that instead of buying the goods from these merchants separately, we declare that there are two spending bundles we utilize. We either pay both 0.71 dollars, or we pay $d_1$ 0.71 dollars and demand 0.71 dollars back from $d_2$. Like in the initial case, we go to a bazaar and spend $z_1$ on bundle one and $z_2$ on bundle two.
现在假设我们不是分别从这些商人那里购买商品而是声明我们使用两种消费组合。我们要么各付0.71美元要么付给 $d_1$ 0.71美元并要求 $d_2$ 返还0.71美元。和最初的情况一样我们去集市在组合1上花费 $z_1$在组合2上花费 $z_2$。

So, let's look at an example where we spend just $z_1 = \sqrt{2} \approx 1.42$ on bundle 1. In this case, the first merchant gets $x_1 = 1$ dollar, and the second merchant gets the same, $x_2 = 1$. Hence, we must get the same amounts of produce as in the example above, shouldn't we?
那么让我们看一个例子我们只在组合1上花费 $z_1 = \sqrt{2} \approx 1.42$。在这种情况下第一个商人得到 $x_1 = 1$ 美元第二个商人也得到 $x_2 = 1$ 美元。因此我们必须得到与上面例子中相同数量的农产品不是吗

Maybe, maybe not. You noticed that the $P$ matrix is not diagonal. This indicates that for some reason how much one merchant charges for his produce also depends on how much we paid the other merchant. They must get an idea of how much we pay them, maybe through rumors? In this case, if we start buying in bundles they'll know for sure how much we pay each of them, because we declare our bundles to the bazaar. In this case, how do we know that the $P$ matrix should stay the same?
可能是也可能不是。你注意到 $P$ 矩阵不是对角矩阵。这表明由于某种原因一个商人对其商品的收费还取决于我们付给另一个商人的钱数。他们一定知道我们付了多少也许是通过谣言在这种情况下如果我们开始按组合购买他们肯定会知道我们付给每个人多少钱因为我们向集市声明了我们的组合。在这种情况下我们怎么知道 $P$ 矩阵应该保持不变呢

Maybe with full information about our payments on the market the pricing formulas would change too! This would change our matrix $P$, and there's no way to say exactly how.
也许有了我们在市场上付款的完整信息定价公式也会改变这将改变我们的矩阵 $P$而且无法确切说明会如何改变。

This is where we enter tensors. Essentially, with tensors we say that the calculations do not change when we start trading in bundles instead of directly with each merchant. That's the constraint that will impose transformation rules on $P$, which we'll call a tensor.
这就是我们进入张量领域的地方。本质上对于张量我们说当我们开始按组合交易而不是直接与每个商人交易时计算结果不会改变。这是一种约束它将对 $P$ 施加变换规则我们将 $P$ 称为张量。

Particularly, we may notice that we have an orthonormal basis $\bar{d}_1, \bar{d}_2$, where $\bar{d}_i$ means a payment of 1 dollar to merchant $i$ and nothing to the other. We may also notice that the bundles also form an orthonormal basis $\bar{d}_1', \bar{d}_2'$, which is also a simple rotation of the first basis by 45 degrees counterclockwise. It's also a PC decomposition of the first basis. Hence, we are saying that switching to the bundles is simply a change of coordinates, and it should not change the calculations. Note that this is an outside constraint that we imposed on the model. It didn't come from the pure math properties of matrices.
特别是我们可能会注意到我们有一个标准正交基 dˉ1,dˉ2\bar{d}_1, \bar{d}_2dˉ1​,dˉ2​其中 did_idi​ 表示付给商人 iii 1美元而不付给另一个人。我们可能还会注意到这些组合也形成了一个标准正交基 dˉ1′,dˉ2′\bar{d}_1, \bar{d}_2dˉ1′​,dˉ2′​它也是第一个基逆时针旋转45度的简单旋转。它也是第一个基的主成分分解。因此我们说切换到组合只是坐标的改变不应改变计算结果。请注意这是我们施加给模型的外部约束它并非来自矩阵的纯数学性质。 Now, our shopping can be expressed as a vector xx1dˉ1x2dˉ2x x_1 \bar{d}_1 x_2 \bar{d}_2xx1​dˉ1​x2​dˉ2​. The vectors are tensors too, BTW. The tensor is interesting: it can be represented as 现在我们的购物可以表示为向量 xx1dˉ1x2dˉ2x x_1 \bar{d}_1 x_2 \bar{d}_2xx1​dˉ1​x2​dˉ2​。顺便说一下向量也是张量。这个张量很有趣它可以表示为 P∑ijpijdˉidˉjP \sum_{i j} p_{i j} \bar{d}_i \bar{d}_jPij∑​pij​dˉi​dˉj​ , and the groceries as yy1dˉ1y2dˉ2y y_1 \bar{d}_1 y_2 \bar{d}_2yy1​dˉ1​y2​dˉ2​. With groceries yiy_iyi​ means pound of produce from the merchant iii, not the dollars paid. 而杂货表示为 yy1dˉ1y2dˉ2y y_1 \bar{d}_1 y_2 \bar{d}_2yy1​dˉ1​y2​dˉ2​。对于杂货yiy_iyi​ 表示从商人 iii 那里得到的农产品磅数而不是支付的美元数。 Now, when we changed the coordinates to bundles the tensor equation stays the same: 现在当我们将坐标改为组合时张量方程保持不变 yPzy P zyPz That’s nice, but the payment vectors are now in the different basis: 这很好但支付向量现在处于不同的基下 zz1dˉ1′z2dˉ2′z z_1 \bar{d}_1 z_2 \bar{d}_2zz1​dˉ1′​z2​dˉ2′​ , while we may keep the produce vectors in the old basis yy1dˉ1y2dˉ2y y_1 \bar{d}_1 y_2 \bar{d}_2yy1​dˉ1​y2​dˉ2​. The tensor changes too: 而我们可以将农产品向量保持在旧基下 yy1dˉ1y2dˉ2y y_1 \bar{d}_1 y_2 \bar{d}_2yy1​dˉ1​y2​dˉ2​。张量也会变化 P∑ijpij′dˉi′dˉj′P \sum_{i j} p_{i j} \bar{d}_i \bar{d}_jPij∑​pij′​dˉi′​dˉj′​ . It’s easy to derive how the tensor must be transformed, it’s going to be PAP APA, where the rotation matrix is defined as dˉ′Adˉ\bar{d} A \bar{d}dˉ′Adˉ. In our case it’s the coefficient of the bundle. 很容易推导出张量必须如何变换它将是 PAP APA其中旋转矩阵定义为 dˉ′Adˉ\bar{d} A \bar{d}dˉ′Adˉ。在我们的例子中它是组合的系数。 We can work out the formulas for tensor transformation, and they’ll yield the same result as in the examples with x1x21x_1 x_2 1x1​x2​1 and z10.71,z20z_1 0.71, z_2 0z1​0.71,z2​0. 我们可以推导出张量变换的公式它们将得到与 x1x21x_1 x_2 1x1​x2​1 和 z10.71,z20z_1 0.71, z_2 0z1​0.71,z2​0 的例子相同的结果。 edited Dec 13, 2019 at 7:02 I got confused around here: So, lets look at an example where we spend just z11.42 on bundle 1. In this case, the first merchant gets x11 dollars, and the second merchant gets the same x21. Earlier you say that first bundle is that we pay both 0.71 dollars. So spending 1.42 on the first bundle should get 0.71 each and not 1, no? 我在这里有点困惑“那么让我们看一个例子我们只在组合1上花费z11.42。在这种情况下第一个商人得到x11美元第二个商人也得到x21美元。” 你之前说第一个组合是我们“各付0.71美元”。所以在第一个组合上花费1.42应该各得到0.71而不是1不是吗 —— amoeba Commented Feb 25, 2016 at 10:32 ameba, the idea’s that a bundle 1 is dˉ1/2dˉ2/2\bar{d}_1 / \sqrt{2} \bar{d}_2 / \sqrt{2}dˉ1​/2​dˉ2​/2​, so with 2\sqrt{2}2​ bundle 1 you get dˉ1dˉ2\bar{d}_1 \bar{d}_2dˉ1​dˉ2​, i.e. 1$ each 阿米巴这个想法是组合1是 dˉ1/2dˉ2/2\bar{d}_1 / \sqrt{2} \bar{d}_2 / \sqrt{2}dˉ1​/2​dˉ2​/2​所以用 2\sqrt{2}2​ 个组合1你会得到 dˉ1dˉ2\bar{d}_1 \bar{d}_2dˉ1​dˉ2​即各1美元。 —— Aksakal Commented Feb 25, 2016 at 14:44 Aksakal, I know this discussion is quite old, but I don’t get that either (although I was really trying to). Where does that idea that a bundle 1 is dˉ1/2dˉ2/2\bar{d}_1 / \sqrt{2} \bar{d}_2 / \sqrt{2}dˉ1​/2​dˉ2​/2​ come from? Could you elaborate? How is that when you pay 1.42 for the bundle both merchants get 1? 阿克萨卡尔我知道这个讨论已经很久了但我也不明白尽管我真的很努力去理解。组合1是 dˉ1/2dˉ2/2\bar{d}_1 / \sqrt{2} \bar{d}_2 / \sqrt{2}dˉ1​/2​dˉ2​/2​ 这个想法来自哪里你能详细说明一下吗为什么当你为这个组合支付1.42时两个商人各得到1美元 —— Matek Commented Sep 14, 2016 at 8:16 Aksakal This is great, thanks! I think you have a typo on the very last line, where you say x1 x2 1 (correct) and z1 0.71, z2 0. 
Presuming I understood everything correctly, z1 should be 1.42 (or 1.41, which is slightly closer to 2^0.5). 阿克萨卡尔这很棒谢谢我认为你在最后一行有一个笔误你说x1 x2 1正确z1 0.71z2 0。假设我理解正确的话z1应该是1.42或者1.41更接近2^0.5。 —— Mike Williamson Commented Aug 3, 2017 at 2:14 Aksakal from “Now, our shopping can be expressed as a vector…” your post is a bit unclear. Would you mind explaining it a bit more detail please? What’s the PPP notation you’re using? Could this simply be explained in terms of change of basis? With your notation, it seems like AAA would be the matrix for change of basis from d′dd′ to ddd, correct? I.e. to get the “transformation” w.r.t bundle basis, we compose the transformation w.r.t. starting basis together with the change of basis matrix, correct? 阿克萨卡尔从“现在我们的购物可以表示为一个向量……”开始你的帖子有点不清楚。你介意更详细地解释一下吗你使用的 PPP 符号是什么意思这能简单地用基的变换来解释吗根据你的符号似乎 AAA 是从 d′dd′ 到 ddd 的基变换矩阵对吗也就是说为了得到相对于组合基的“变换”我们将相对于初始基的变换与基变换矩阵组合起来对吗 —— Jake1234 Commented Sep 26, 2020 at 21:12 As someone who studies and builds neural networks and has repeatedly asked this question, I’ve come to the conclusion that we borrow useful aspects of tensor notation simply because they make derivation a lot easier and keep our gradients in their native shapes. The tensor chain rule is one of the most elegant derivation tools I have ever seen. Further tensor notations encourage computationally efficient simplifications that are simply nightmarish to find when using common extended versions of vector calculus. 作为研究和构建神经网络并反复问这个问题的人我得出的结论是我们借鉴张量符号的有用方面仅仅是因为它们使推导变得容易得多并使我们的梯度保持其原生形状。张量链式法则是我见过的最优雅的推导工具之一。此外张量符号有助于实现计算上高效的简化而这些简化在使用常见的向量微积分扩展版本时是难以实现的。 In Vector/Matrix calculus for instance there are 4 types of matrix products (Hadamard, Kronecker, Ordinary, and Elementwise) but in tensor calculus there is only one type of multiplication yet it covers all matrix multiplications and more. If you want to be generous, interpret tensor to mean multi-dimensional array that we intend to use tensor based calculus to find derivatives for, not that the objects we are manipulating are tensors*. 例如在向量/矩阵微积分中有4种矩阵乘积哈达玛积、克罗内克积、普通积和元素积但在张量微积分*中只有一种乘法却涵盖了所有矩阵乘法甚至更多。如果你想宽容一点可以将张量解释为我们打算使用基于张量的微积分来求导的多维数组而不是说我们正在操作的对象是张量。 In all honesty we probably call our multi-dimensional arrays tensors because most machine learning experts don’t care that much about adhering to the definitions of high level math or physics. The reality is we are just borrowing well developed Einstein Summation Conventions and Calculi which are typically used when describing tensors and don’t want to say Einstein summation convention based calculus over and over again. Maybe one day we might develop a new set of notations and conventions that steal only what they need from tensor calculus specifically for analyzing neural networks, but as a young field that takes time. 老实说我们可能将多维数组称为张量是因为大多数机器学习专家并不太在意是否遵循高等数学或物理学的定义。事实上我们只是在借用发展完善的爱因斯坦求和约定和微积分这些通常在描述张量时使用而不想一遍又一遍地说基于爱因斯坦求和约定的微积分。也许有一天我们会开发出一套新的符号和约定专门从张量微积分中提取分析神经网络所需的部分但作为一个年轻的领域这需要时间。 edited Jul 5, 2017 at 22:10 – gung - Reinstate Monica Commented Jul 4, 2017 at 1:17 Now I actually agree with most of the content of the other answers. But I’m going to play Devil’s advocate on one point. Again, it will be free flowing, so apologies… 现在我实际上同意其他回答的大部分内容。但在一点上我要唱反调。再次说明这将是自由发挥的所以先致歉…… Google announced a program called Tensor Flow for deep learning. This made me wonder what was ‘tensor’ about deep learning, as I couldn’t make the connection to the definitions I’d seen. 
谷歌宣布了一个名为Tensor Flow张量流的深度学习程序。这让我想知道深度学习中的“张量”是什么因为我无法将其与我所看到的定义联系起来。 Deep learning models are all about transformation of elements from one space to another. E.g. if we consider two layers of some network you might write co-ordinate iii of a transformed variable yyy as a nonlinear function of the previous layer, using the fancy summation notation: 深度学习模型全是关于元素从一个空间到另一个空间的变换。例如如果我们考虑某个网络的两层你可以使用复杂的求和符号将变换变量 yyy 的坐标 iii 写成前一层的非线性函数 yiσ(βjixj)y_i \sigma (\beta_{j i} x_j)yi​σ(βji​xj​) Now the idea is to chain together a bunch of such transformations in order to arrive at a useful representation of the original co-ordinates. So, for example, after the last transformation of an image a simple logistic regression will produce excellent classification accuracy; whereas on the raw image it would definitely not. 现在的想法是将一系列这样的变换串联起来以得到原始坐标的有用表示。例如在对图像进行最后一次变换后简单的逻辑回归会产生出色的分类精度而在原始图像上则肯定不会。 Now, the thing that seems to have been lost from sight is the invariance properties sought in a proper tensor. Particularly when the dimensions of transformed variables may be different from layer to layer. [E.g. some of the stuff I’ve seen on tensors makes no sense for non square Jacobians - I may be lacking some methods] 现在似乎被忽略的是真正张量所追求的不变性。特别是当变换变量的维度可能在层与层之间不同时。例如我所看到的一些关于张量的内容对于非平方雅可比矩阵来说是没有意义的——我可能缺乏一些方法 What has been retained is the notion of transformations of variables, and that certain representations of a vector may be more useful than others for particular tasks. Analogy being whether it makes more sense to tackle a problem in Cartesian or polar co-ordinates. 保留下来的是变量变换的概念以及向量的某些表示对于特定任务可能比其他表示更有用。可以类比为在笛卡尔坐标还是极坐标下解决问题更有意义。 EDIT in response to Aksakal: 编辑以回应阿克萨卡尔 The vector can’t be perfectly preserved because of the changes in the numbers of coordinates. However, in some sense at least the useful information may be preserved under transformation. For example with PCA we may drop a co-ordinate, so we can’t invert the transformation but the dimensionality reduction may be useful nonetheless. If all the successive transformations were invertible, you could map back from the penultimate layer to input space. As it is, I’ve only seen probabilistic models which enable that (RBMs) by sampling. 由于坐标数量的变化向量无法被完美保留。然而在某种意义上至少有用的信息在变换下可能被保留。例如在主成分分析中我们可能会丢弃一个坐标因此我们无法反转变换但降维可能仍然有用。如果所有连续的变换都是可逆的你可以从倒数第二层映射回输入空间。事实上我只见过通过采样实现这一点的概率模型受限玻尔兹曼机。 edited Feb 25, 2016 at 22:34 conjectures In the context of neural networks I had always assumed tensors were acting just as multidimensional arrays. Can you elaborate on how the invariance properties are aiding classification/representation? 在神经网络的背景下我一直认为张量只是作为多维数组发挥作用。你能详细说明一下不变性如何帮助分类/表示吗 —— Y. S. Commented Feb 25, 2016 at 16:02 Maybe I wasn’t clear above, but it seems to me - if the interpretation is correct - the goal of invariant properties has been dropped. What seems to have been kept is the idea of variable transformations. 也许我上面说得不清楚但在我看来——如果这种解释是正确的——不变性的目标已经被放弃了。似乎保留下来的是变量变换的思想。 —— conjectures Commented Feb 25, 2016 at 20:02 conjectures, if you have a vector rˉ\bar{r}rˉ in cartesian coordinates, then convert it to polar coordinates, the vector stays the same, i.e. it still point from the same point in the same direction. Are you saying that in machine learning the coordinate transformation changes the initial vector? 猜想如果你在笛卡尔坐标中有一个向量 rˉ\bar{r}rˉ然后将其转换为极坐标这个向量保持不变即它仍然从同一点指向同一方向。你是说在机器学习中坐标变换会改变初始向量吗 —— Aksakal Commented Feb 25, 2016 at 20:18 but isn’t that a property of the transformation more than the tensor? 
At least with linear and element-wise type transformations, which seem more popular in neural nets, they are equally present with vectors and matrices; what are the added benefits of the tensors? 但这不更是变换的性质而非张量的性质吗至少对于线性和元素级的变换在神经网络中似乎更流行它们在向量和矩阵中同样存在张量的额外好处是什么 —— Y. S. Commented Feb 26, 2016 at 9:47 conjectures, PCA is just a rotation and projection. It’s like rotating N-dimensional space to PC basis, then projecting to sub-space. Tensors are used in similar situations in physics, e.g. when looking at forces on the surfaces inside bodies etc. 猜想主成分分析只是一种旋转和投影。这就像将N维空间旋转到主成分基然后投影到子空间。张量在物理学中的类似情况下使用例如研究物体内部表面上的力等。 —— Aksakal Commented Feb 26, 2016 at 12:32 Here is a lightly edited (for context) excerpt from Non-Negative Tensor Factorization with Applications to Statistics and Computer Vision, A. Shashua and T. Hazan_ which gets to the heart of why at least some people are fascinated with tensors. 以下是《非负张量分解及其在统计学和计算机视觉中的应用》A. Shashua和T. Hazan著…中的一段为上下文略作编辑它道出了至少一部分人对张量着迷的核心原因。 Any n-dimensional problem can be represented in two dimensional form by concatenating dimensions. Thus for example, the problem of finding a non-negative low rank decomposition of a set of images is a 3-NTF (Non-negative Tensor Factorization), with the images forming the slices of a 3D cube, but can also be represented as an NMF (Non-negative Matrix Factorization) problem by vectorizing the images (images forming columns of a matrix). 任何n维问题都可以通过连接维度以二维形式表示。例如寻找一组图像的非负低秩分解的问题是一个三维非负张量分解3-NTF问题图像形成三维立方体的切片但也可以通过将图像向量化图像形成矩阵的列表示为非负矩阵分解NMF问题。   There are two reasons why a matrix representation of a collection of images would not be appropriate: 图像集合的矩阵表示之所以不合适有两个原因   Spatial redundancy (pixels, not necessarily neighboring, having similar values) is lost in the vectorization thus we would expect a less efficient factorization, and 空间冗余不一定相邻的、具有相似值的像素在向量化过程中丢失因此我们预计分解效率会较低 An NMF decomposition is not unique therefore even if there exists a generative model (of local parts) the NMF would not necessarily move in that direction, which has been verified empirically by Chu, M., Diele, F., Plemmons, R., Ragni, S. “Optimality, computation and interpretation of nonnegative matrix factorizations” SIAM Journal on Matrix Analysis, 2004. For example, invariant parts on the image set would tend to form ghosts in all the factors and contaminate the sparsity effect. An NTF is almost always unique thus we would expect the NTF scheme to move towards the generative model, and specifically not be influenced by invariant parts.   非负矩阵分解不具有唯一性因此即使存在局部部分的生成模型非负矩阵分解也不一定会朝着该方向进行这一点已由Chu, M.、Diele, F.、Plemmons, R.和Ragni, S.在《非负矩阵分解的最优性、计算和解释》《SIAM矩阵分析期刊》2004年中通过实证验证。例如图像集中的不变部分往往会在所有因子中形成虚影并破坏稀疏性效果。而非负张量分解几乎总是唯一的因此我们预计非负张量分解方案会朝着生成模型的方向进行特别是不受不变部分的影响。 answered Sep 3, 2016 at 14:52 Mark L. Stone Qualitatively, what is the difference between a matrix and a tensor? 从定性角度看矩阵和张量的区别是什么 edited Sep 21, 2015 at 13:00 rubik asked Sep 21, 2015 at 1:49 Life_student Qualitatively (or mathematically “light”), could someone describe the difference between a matrix and a tensor? I have only seen them used in the context of an undergraduate, upper level classical mechanics course, and within that context, I never understood the need to distinguish between matrices and tensors. They seemed like identical mathematical entities to me. 从定性角度或说 “浅层次” 的数学角度有人能描述一下矩阵和张量的区别吗我只在本科高年级的经典力学课程中见过它们的应用在那个语境下我始终不明白为什么要区分矩阵和张量。在我看来它们似乎是完全相同的数学实体。 Just as an aside, my math background is roughly the one of a typical undergraduate physics major (minus the linear algebra). 
顺便说一下我的数学背景大概相当于典型的本科物理专业学生除了线性代数知识稍欠缺。 edited Sep 21, 2015 at 13:00 rubik asked Sep 21, 2015 at 1:49 Life_student Answers Coordinate-wise, one could say that a matrix is a “square” of numbers, while a tensor is an n-dimensional block of numbers. But this is horrible, not insightful and even a bit wrong, since those coordinates must “change in appropriate ways” (this is part of why this is horrible). 从坐标角度来说可以认为矩阵是一个 “方形” 的数字集合而张量是一个 n 维的数字块。但这种说法很糟糕、毫无洞见甚至有点错误因为这些坐标必须 “以恰当的方式变化”这也是这种说法糟糕的部分原因。 It may be best to think as follows: given a vector space VVV, a matrix can be seen in an adequate way as a bilinear map V∗×V→RV^* \times V \to \mathbb{R}V∗×V→R (since you asked for it, I’ll not enter into details. Here, V∗V^*V∗ is the dual of VVV). A tensor can be interpreted as a multilinear map V∗×⋯×V∗×V×⋯×V→RV^* \times \dots \times V^* \times V \times \dots \times V \to \mathbb{R}V∗×⋯×V∗×V×⋯×V→R (not necessarily the same quantity of V∗V^*V∗s and VVV’s). 或许更好的理解方式是给定一个向量空间 VVV矩阵可以被恰当地视为一个双线性映射 V∗×V→RV^* \times V \to \mathbb{R}V∗×V→R既然你问到了我就不展开细节了。这里的 V∗V^*V∗ 是 VVV 的对偶空间。而张量可以被解释为一个多线性映射 V∗×⋯×V∗×V×⋯×V→RV^* \times \dots \times V^* \times V \times \dots \times V \to \mathbb{R}V∗×⋯×V∗×V×⋯×V→RV∗V^*V∗ 和 VVV 的数量不一定相同。 Hence, a matrix is a kind of tensor. But tensors are more general. 因此矩阵是张量的一种。但张量更为通用。 answered Sep 21, 2015 at 2:16 Aloizio Macedo♦ instead of horrible adjective, use a little more complex one : P 与其用 “糟糕” 这个词不如用稍微复杂一点的表述嘛 : P – janmarqz Commented Sep 23, 2015 at 21:50 A rank 0 tensor is a scalar. 秩为 0 的张量是标量。 A rank 1 tensor is a row or column vector. 秩为 1 的张量是行向量或列向量。 A rank 2 tensor is a matrix, often square. 秩为 2 的张量是矩阵通常是方阵。 A rank 3 tensor? Think 3D matrix. Instead of a rectangle with data entries for each column and row, think of a cube. 秩为 3 的张量呢想象一个三维矩阵。不用再想每行每列都有数据的矩形而是想象一个立方体。 Rank 4… go 4D! 秩为 4 的张量…… 就想象四维的情况吧 edited Sep 21, 2015 at 13:07 answered Sep 21, 2015 at 2:11 zahbaz (edited by psmears) Matrices are a special type of tensor, rank 2. Scalars, vectors, matrices, are all tensors. 矩阵是一种特殊类型的张量秩为 2。标量、向量、矩阵都是张量。 Honestly, tensors are so general that the vast majority of things you deal with in your class are tensors. 老实说张量是如此普遍你在课堂上处理的绝大多数事情都是张量。 answered Sep 21, 2015 at 2:15 user223391 张量及其特例标量、向量与矩阵 1. 张量的基本概念 1.1 张量的定义 张量是一个用于表示多维数据和关系的数学对象可被视为多维数组的抽象。其核心特征是阶数order有时也称为 “秩”即描述张量元素所需的索引数量。 1.2 张量的普遍性 张量在数学、物理、工程及计算机科学尤其是深度学习中广泛应用可表示输入数据、权重、物理量如应力、电磁场等。在科学课程中许多常见对象本质上是张量例如线性代数中的矩阵、物理学中的向量场、深度学习中的多维数组等。 2. 张量的阶数与特例 张量的阶数由其维度数量即索引数量决定不同阶数的张量对应着我们熟悉的数学对象 张量阶数对应数学对象定义与特征0 阶标量- 单个数字无维度 - 无索引如温度25℃、质量5kg1 阶向量- 一维数组需 1 个索引定位元素 - 可表示为行向量或列向量如 KaTeX parse error: Cant use function $ in math mode at position 31: … a_1, a_2, a_3 $̲end {bmatrix}$ …begin {bmatrix} a_1 a2a_2 a2​ a_3 endbmatrixend {bmatrix}endbmatrix2 阶矩阵- 二维数组需 2 个索引行索引 iii、列索引 jjj定位元素 - 表示为 $m timesntimes ntimesn 的二维结构mmm 和 nnn 可不等矩形矩阵也可相等正方形矩阵- 数学符号记为 AijA_{ij}Aij​3 阶及以上高阶张量- 三维及以上数组需 3 个及以上索引定位元素 - 可类比为 “立方体”3 阶或更高维结构如深度学习中的批量图像数据样本数 × 高度 × 宽度 × 通道数4 阶张量 3. 
“阶数” 与 “矩阵的秩”

张量的阶数描述张量的维度数量（索引数量），是张量的固有属性，如矩阵必为 2 阶张量。线性代数中的“矩阵的秩”指矩阵中线性独立的行（或列）向量的最大数量，是一个标量值，如 $3 \times 4$ 矩阵的秩最大为 3。

注意：为避免混淆，张量语境中通常用“阶数”（order）而非“秩”来描述维度数量。

总结

标量、向量和矩阵都可以被视为张量的特例。
张量的阶数由其维度数量（索引数量）决定，是描述多维数据的通用工具。
区分“张量的阶数”与“矩阵的秩”是理解张量概念的关键。

标量：标量是零维数组，没有索引，阶数为 0。它是一个单独的数字，例如温度（25℃）或质量（5kg）。

向量：向量是一维数组，有一个索引，阶数为 1。它可以表示为行向量或列向量，例如 $\begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix}$ 或 $\begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}$。

矩阵：矩阵是二维数组，有两个索引，阶数为 2。它是一个 $m \times n$ 的二维结构，例如 $\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$。

因此，标量、向量和矩阵都可以看作是张量的特例，分别对应 0 阶、1 阶和 2 阶张量。

via:
What's the difference between a matrix and a tensor? | Steven Steinke | Medium
https://medium.com/quantumsteinke/whats-the-difference-between-a-matrix-and-a-tensor-4505fbdc576c
Difference Between Scalar, Vector, Matrix and Tensor | GeeksforGeeks
https://www.geeksforgeeks.org/machine-learning/difference-between-scalar-vector-matrix-and-tensor/
What are the Differences Between a Matrix and a Tensor? | Mathematics Stack Exchange
https://math.stackexchange.com/questions/412423/what-are-the-differences-between-a-matrix-and-a-tensor
Why the sudden fascination with tensors? | Cross Validated
https://stats.stackexchange.com/questions/198061/why-the-sudden-fascination-with-tensors
Qualitatively, what is the difference between a matrix and a tensor? | Mathematics Stack Exchange
https://math.stackexchange.com/questions/1444412/qualitatively-what-is-the-difference-between-a-matrix-and-a-tensor
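As a closing illustration of the order (number of indices) idea summarized above, here is a minimal numpy sketch in the same style as the earlier snippets; the shapes are arbitrary examples, not taken from the sources above.
作为对上文“阶数（索引数量）”这一概念的收尾示例（仅为示意，数组形状为任意举例）：

```python
import numpy as np

scalar = np.array(25.0)                      # 0th-order: no index // 0 阶：没有索引
vector = np.array([1.0, 2.0, 3.0])           # 1st-order: one index // 1 阶：一个索引
matrix = np.array([[1.0, 2.0],
                   [3.0, 4.0]])              # 2nd-order: two indices // 2 阶：两个索引
batch = np.zeros((32, 28, 28, 3))            # 4th-order: samples x height x width x channels
                                             # 4 阶：样本数 × 高 × 宽 × 通道数

for t in (scalar, vector, matrix, batch):
    print(t.ndim, t.shape)                   # ndim is the order (number of indices) // ndim 即阶数
```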