Eigen: A C++ Linear Algebra Template Library: MD Ashiqur Rahman

Eigen is a C++ template library for linear algebra that provides simple interfaces and good performance. It uses expression templates and lazy evaluation to avoid unnecessary temporary objects and vectorizes operations using SIMD instructions. Eigen can handle dense and sparse matrices and vectors and includes linear algebra algorithms, geometry functions, and other features. Benchmarks show it performs comparably to optimized libraries like MKL and GotoBLAS.


Eigen: A C++ Linear Algebra Template Library

Md Ashiqur Rahman
Outline
● Introduction & Motivation
● How it works
● Implementation of Eigen
- Expression templates, Lazy evaluation, Vectorization
● Aliasing problems
● Platforms
● Eigen vs BLAS/LAPACK
● Benchmark
● Conclusion
Introduction
● A C++ template library for linear algebra

● Header only, nothing to install or compile

● Provides good speed and a simple, easy-to-use interface

● Open source
Why Another Library
● Multiplatform, with good compiler support
● A single unified library
● Most libraries specialize in a single feature or module
● Eigen satisfies all of these criteria

- Free, fast, versatile, and reliable, with a decent API; supports both sparse and
dense matrices, vectors, and arrays; provides linear algebra algorithms (LU, QR, ...)
and geometric transformations.
How it works
● The Matrix class template takes 3 compulsory and 3 optional template parameters
Matrix<typename Scalar,
       int RowsAtCompileTime,
       int ColsAtCompileTime,
       int Options = 0,
       int MaxRowsAtCompileTime = RowsAtCompileTime,
       int MaxColsAtCompileTime = ColsAtCompileTime>

● Convenience typedefs cover common scalar types and sizes; Dynamic means the size is only known at run time

typedef Matrix<float, 4, 4> Matrix4f;
typedef Matrix<double, Dynamic, Dynamic> MatrixXd;
typedef Matrix<float, 3, 1> Vector3f;
typedef Matrix<int, 1, 2> RowVector2i;
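How a size template parameter can select storage at compile time can be sketched in standard C++. This is a simplified illustration, not Eigen's implementation: `MiniVector`, `kDynamic`, and the two aliases are hypothetical names introduced only for this example.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Hypothetical sentinel meaning "size is only known at run time".
constexpr int kDynamic = -1;

// Fixed-size case: the size is a compile-time constant and the
// coefficients live inside the object itself (no heap allocation).
template <typename Scalar, int Size>
class MiniVector {
public:
    MiniVector() = default;
    explicit MiniVector(int n) { assert(n == Size); }
    int size() const { return Size; }
    Scalar& operator[](int i) { return data_[i]; }
    const Scalar& operator[](int i) const { return data_[i]; }

private:
    std::array<Scalar, Size> data_{};
};

// Dynamic-size case: a partial specialization that stores its
// coefficients on the heap and takes the size at construction.
template <typename Scalar>
class MiniVector<Scalar, kDynamic> {
public:
    explicit MiniVector(int n) : data_(n) {}
    int size() const { return static_cast<int>(data_.size()); }
    Scalar& operator[](int i) { return data_[i]; }
    const Scalar& operator[](int i) const { return data_[i]; }

private:
    std::vector<Scalar> data_;
};

// Aliases in the spirit of the typedefs above.
using MiniVector3f = MiniVector<float, 3>;
using MiniVectorXd = MiniVector<double, kDynamic>;
```

With these definitions, `MiniVector3f a;` needs no size argument, while `MiniVectorXd b(5);` receives its size at run time.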

Eigen Implementation: 1D array
● Simple vector addition example
int size = 50;
Eigen::VectorXf u(size), v(size), w(size);
u = v + w;

● Coefficients are stored in a one-dimensional array, so a single loop traverses them

for(int i = 0; i < size; ++i)
    u[i] = v[i] + w[i];
Eigen Implementation: use expression template
● A naive operator+ would return a temporary object, as if we had written
VectorXf tmp = v + w;
VectorXf u = tmp;
● That costs two loops and one extra vector
for(int i = 0; i < size; i++) tmp[i] = v[i] + w[i];
for(int i = 0; i < size; i++) u[i] = tmp[i];

● Eigen uses expression templates to avoid unnecessary temporary objects, so the assignment compiles down to a single loop
for(int i = 0; i < size; i++) u[i] = v[i] + w[i];
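The idea of expression templates can be shown with a minimal sketch in standard C++. It is a toy version under stated assumptions, not Eigen's code: `Vec` and `SumExpr` are hypothetical types, only `operator+` is supported, and the expression node stores references that must stay valid until the assignment has run.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Lightweight node describing "lhs + rhs". It stores references only
// and computes a coefficient when asked for it.
template <typename L, typename R>
struct SumExpr {
    const L& lhs;
    const R& rhs;
    float operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

struct Vec {
    std::vector<float> data;
    explicit Vec(std::size_t n, float value = 0.0f) : data(n, value) {}
    float operator[](std::size_t i) const { return data[i]; }
    float& operator[](std::size_t i) { return data[i]; }
    std::size_t size() const { return data.size(); }

    // Assigning an expression runs one fused loop; no temporary
    // vector is created for the right-hand side.
    template <typename L, typename R>
    Vec& operator=(const SumExpr<L, R>& e) {
        assert(e.size() == size());
        for (std::size_t i = 0; i < size(); ++i) data[i] = e[i];
        return *this;
    }
};

// v + w does no arithmetic; it only builds the expression node.
inline SumExpr<Vec, Vec> operator+(const Vec& a, const Vec& b) {
    return {a, b};
}

// (v + w) + x nests the previous node inside a new one.
template <typename L, typename R>
SumExpr<SumExpr<L, R>, Vec> operator+(const SumExpr<L, R>& a, const Vec& b) {
    return {a, b};
}
```

Writing `u = v + w + x;` with these types performs all additions inside the single loop of `operator=`.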
Eigen Implementation: lazy evaluation
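● Expressions are evaluated lazily wherever that is beneficial.

● Exceptions, where a sub-expression is evaluated immediately into a temporary:
- A matrix product nested inside a larger expression: matrix3 * matrix4 is evaluated first

matrix1 = matrix2 + matrix3 * matrix4;

- A sub-expression for which the cost model favors immediate evaluation: matrix3 + matrix4 is evaluated first, because the product reads each of its coefficients several times

matrix1 = matrix2 * (matrix3 + matrix4);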
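Why immediate evaluation can be cheaper for the second case is easy to check by counting additions. The sketch below uses plain row-major `std::vector<double>` matrices; `Mat`, `product_with_lazy_sum`, and `product_with_evaluated_sum` are hypothetical names for this illustration and do not reflect how Eigen implements its products.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Mat = std::vector<double>;  // n x n matrix, row-major

// Fully lazy: every read of (B + C)(k, j) recomputes the sum,
// so the addition runs n * n * n times.
Mat product_with_lazy_sum(const Mat& A, const Mat& B, const Mat& C,
                          std::size_t n, long& additions_in_sum) {
    assert(A.size() == n * n && B.size() == n * n && C.size() == n * n);
    Mat R(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            for (std::size_t k = 0; k < n; ++k) {
                const double s = B[k * n + j] + C[k * n + j];
                ++additions_in_sum;
                R[i * n + j] += A[i * n + k] * s;
            }
    return R;
}

// Immediate evaluation: the sum is computed once into a temporary,
// so the addition runs only n * n times.
Mat product_with_evaluated_sum(const Mat& A, const Mat& B, const Mat& C,
                               std::size_t n, long& additions_in_sum) {
    assert(A.size() == n * n && B.size() == n * n && C.size() == n * n);
    Mat T(n * n);
    for (std::size_t idx = 0; idx < n * n; ++idx) {
        T[idx] = B[idx] + C[idx];
        ++additions_in_sum;
    }
    Mat R(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            for (std::size_t k = 0; k < n; ++k)
                R[i * n + j] += A[i * n + k] * T[k * n + j];
    return R;
}
```

For 4 x 4 matrices the lazy variant performs 64 additions inside the sum and the evaluated variant 16, while both return the same product.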

Eigen Implementation: lazy or immediate evaluation
● Implementation of the assignment operator (=)
template<typename Derived>
template<typename OtherDerived>
inline Derived& MatrixBase<Derived>
  ::operator=(const MatrixBase<OtherDerived>& other)
{
  return internal::assign_selector<Derived,OtherDerived>::run(derived(), other.derived());
}

● internal::assign_selector decides at compile time, through its EvalBeforeAssigning parameter, whether the right-hand side is evaluated into a temporary first
template<typename Derived, typename OtherDerived,
         bool EvalBeforeAssigning = int(OtherDerived::Flags) & EvalBeforeAssigningBit,
         bool NeedToTranspose = Derived::IsVectorAtCompileTime
                && OtherDerived::IsVectorAtCompileTime
                && int(Derived::RowsAtCompileTime) == int(OtherDerived::ColsAtCompileTime)
                && int(Derived::ColsAtCompileTime) == int(OtherDerived::RowsAtCompileTime)
                && int(Derived::SizeAtCompileTime) != 1>
struct internal::assign_selector;
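The dispatch pattern used here, a class template specialized on a compile-time boolean, can be sketched in a few lines of standard C++. Everything in this sketch is hypothetical (`Doubled`, `Reversed`, `AssignSelector`, `assign`); it only mirrors the shape of the mechanism, with each expression type advertising its own `EvalBeforeAssigning` flag.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<int>;

// Coefficient i depends only on src[i], so lazy assignment is safe
// even when the destination is src itself.
struct Doubled {
    const Vec& src;
    static constexpr bool EvalBeforeAssigning = false;
    int coeff(std::size_t i) const { return 2 * src[i]; }
    std::size_t size() const { return src.size(); }
};

// Coefficient i reads src[n-1-i], so writing into src while reading
// from it would corrupt the result; request a temporary.
struct Reversed {
    const Vec& src;
    static constexpr bool EvalBeforeAssigning = true;
    int coeff(std::size_t i) const { return src[src.size() - 1 - i]; }
    std::size_t size() const { return src.size(); }
};

// Primary template: lazy path, coefficients are written directly.
template <typename Expr, bool EvalBefore = Expr::EvalBeforeAssigning>
struct AssignSelector {
    static Vec& run(Vec& dst, const Expr& e) {
        assert(dst.size() == e.size());
        for (std::size_t i = 0; i < dst.size(); ++i) dst[i] = e.coeff(i);
        return dst;
    }
};

// Specialization: evaluate into a temporary, then hand it over.
template <typename Expr>
struct AssignSelector<Expr, true> {
    static Vec& run(Vec& dst, const Expr& e) {
        Vec tmp(e.size());
        for (std::size_t i = 0; i < tmp.size(); ++i) tmp[i] = e.coeff(i);
        dst.swap(tmp);
        return dst;
    }
};

// Entry point: the compiler picks the path from the expression type.
template <typename Expr>
Vec& assign(Vec& dst, const Expr& e) {
    return AssignSelector<Expr>::run(dst, e);
}
```

Calling `assign(v, Reversed{v})` takes the temporary path and `assign(v, Doubled{v})` the direct one, with no run-time branch in either case.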
Eigen Implementation: Automatic vectorization
● Eigen vectorizes explicitly itself rather than depending on the compiler's auto-vectorizer.

● The vectorized code path differs per architecture

● Supported SIMD instruction sets: SSE2, AltiVec, ARM NEON

Eigen Implementation: Automatic vectorization
● SSE and NEON work with 16-byte packets.

● A packet holds 4 floats, 4 ints, or 2 doubles.

● One packet operation therefore performs 4 float additions

● Our vector has size 50, so 48 coefficients are processed as 12 packets and the last 2 one by one

for(int i = 0; i < 4*(size/4); i+=4) u.packet(i) = v.packet(i) + w.packet(i);
for(int i = 4*(size/4); i < size; i++) u[i] = v[i] + w[i];
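The two loops above can be emulated in portable C++ to make the iteration counts concrete. This sketch assumes nothing about real SIMD hardware: `Packet4f`, `load`, `store`, `add`, and `add_vectors` are hypothetical stand-ins, and the "packet" is just a struct of four floats rather than an SSE or NEON register.

```cpp
#include <cassert>
#include <vector>

// Stand-in for a 16-byte SIMD register holding 4 floats.
struct Packet4f { float lane[4]; };

inline Packet4f load(const float* p) { return {{p[0], p[1], p[2], p[3]}}; }

inline void store(float* p, const Packet4f& a) {
    for (int k = 0; k < 4; ++k) p[k] = a.lane[k];
}

// One call models one packet addition (4 float additions at once).
inline Packet4f add(const Packet4f& a, const Packet4f& b) {
    Packet4f r;
    for (int k = 0; k < 4; ++k) r.lane[k] = a.lane[k] + b.lane[k];
    return r;
}

struct LoopStats { int packet_iterations; int scalar_iterations; };

// u = v + w: full packets first, then the leftover coefficients.
LoopStats add_vectors(const std::vector<float>& v,
                      const std::vector<float>& w,
                      std::vector<float>& u) {
    assert(v.size() == w.size() && u.size() == v.size());
    const int size = static_cast<int>(u.size());
    const int packetSize = 4;
    const int packedEnd = packetSize * (size / packetSize);
    LoopStats stats{0, 0};
    for (int i = 0; i < packedEnd; i += packetSize) {
        store(&u[i], add(load(&v[i]), load(&w[i])));
        ++stats.packet_iterations;
    }
    for (int i = packedEnd; i < size; ++i) {
        u[i] = v[i] + w[i];
        ++stats.scalar_iterations;
    }
    return stats;
}
```

For vectors of size 50 this runs 12 packet iterations followed by 2 scalar iterations.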
Eigen Implementation: which vectorization to use
● Implemented in a helper class, internal::assign_traits
enum {
  StorageOrdersAgree = (int(Derived::IsRowMajor) == int(OtherDerived::IsRowMajor)),
  MightVectorize = StorageOrdersAgree && (int(Derived::Flags) & int(OtherDerived::Flags) & ActualPacketAccessBit),
  MayInnerVectorize = MightVectorize && int(InnerSize)!=Dynamic && int(InnerSize)%int(PacketSize)==0
                      && int(DstIsAligned) && int(SrcIsAligned),
  MayLinearize = StorageOrdersAgree && (int(Derived::Flags) & int(OtherDerived::Flags) & LinearAccessBit),
  MayLinearVectorize = MightVectorize && MayLinearize && DstHasDirectAccess && (DstIsAligned || MaxSizeAtCompileTime==Dynamic),
  MaySliceVectorize = MightVectorize && DstHasDirectAccess && (int(InnerMaxSize)==Dynamic || int(InnerMaxSize)>=3*PacketSize)
};
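A heavily reduced sketch of such a compile-time decision is given below. It keeps only a few of the conditions from the enum above, and the order of preference (inner vectorization first, then linear vectorization, else a plain loop) is an assumption of this sketch. `TraversalChoice`, `Traversal`, `kDynamic`, and `traversal_name` are hypothetical names, not Eigen's.

```cpp
#include <cassert>

// Hypothetical sentinel for a size only known at run time.
constexpr int kDynamic = -1;

enum class Traversal { Default, InnerVectorized, LinearVectorized };

template <int InnerSize, int PacketSize,
          bool DstIsAligned, bool SrcIsAligned, bool LinearAccess>
struct TraversalChoice {
    static_assert(PacketSize > 0, "packet size must be positive");

    // Whole packets fit exactly and both sides are aligned.
    static constexpr bool MayInnerVectorize =
        InnerSize != kDynamic && InnerSize % PacketSize == 0 &&
        DstIsAligned && SrcIsAligned;

    // The data can be walked as one flat array; alignment is either
    // known, or can be reached at run time for dynamic sizes.
    static constexpr bool MayLinearVectorize =
        LinearAccess && (DstIsAligned || InnerSize == kDynamic);

    static constexpr Traversal value =
        MayInnerVectorize    ? Traversal::InnerVectorized
        : MayLinearVectorize ? Traversal::LinearVectorized
                             : Traversal::Default;
};

// Readable name for a traversal kind, e.g. for logging.
inline const char* traversal_name(Traversal t) {
    switch (t) {
        case Traversal::Default:          return "Default";
        case Traversal::InnerVectorized:  return "InnerVectorized";
        case Traversal::LinearVectorized: return "LinearVectorized";
    }
    assert(false && "unknown traversal kind");
    return "";
}
```

Because every input is a template parameter, the chosen traversal is a constant the compiler can use to select a loop implementation with no run-time cost.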
Eigen Implementation: Linear Vectorization implementation
● The leading coefficients may have to be handled one at a time, so that the remaining ones can be grouped into aligned packets of 4.
● First, determine the architecture-specific packet size
const int packetSize = internal::packet_traits<typename Derived1::Scalar>::size;

● Index of the first aligned coefficient

const int alignedStart = internal::assign_traits<Derived1,Derived2>::DstIsAligned ? 0 :
                         internal::first_aligned(&dst.coeffRef(0), size);

● Copy the leading unaligned coefficients one by one

for(int index = 0; index < alignedStart; index++)
  dst.copyCoeff(index, src);
Eigen Implementation: Linear Vectorization implementation
● The vector size 50 is not a multiple of the packet size (4 floats); with alignedStart equal to 0, 48 is the largest
multiple that fits.
const int alignedEnd = alignedStart + ((size-alignedStart)/packetSize)*packetSize;

● Vectorized part
for(int index = alignedStart; index < alignedEnd; index += packetSize)
{
  dst.template copyPacket<Derived2, Aligned, internal::assign_traits<Derived1,Derived2>::
      SrcAlignment>(index, src);
}

● Remaining coefficients (the last two in our example)

for(int index = alignedEnd; index < size; index++)
  dst.copyCoeff(index, src);
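The three phases (unaligned head, packets, tail) can be traced with a small standalone sketch. It assumes 16-byte packets of 4 floats and a destination pointer that is at least float-aligned; `first_aligned_index`, `split_ranges`, and `copy_linear` are hypothetical helpers written for this example and are not Eigen's internal functions.

```cpp
#include <cassert>
#include <cstdint>

// Index of the first element of p that lies on a 16-byte boundary,
// capped at size when no such element is in range.
inline int first_aligned_index(const float* p, int size) {
    const int packetSize = 4;  // floats per 16-byte packet
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
    const int offset = static_cast<int>((addr / sizeof(float)) % packetSize);
    const int skip = (packetSize - offset) % packetSize;
    return skip < size ? skip : size;
}

struct Ranges { int alignedStart; int alignedEnd; };

// Same arithmetic as on the slides: [alignedStart, alignedEnd) holds
// a whole number of packets.
inline Ranges split_ranges(const float* dst, int size) {
    const int packetSize = 4;
    const int alignedStart = first_aligned_index(dst, size);
    const int alignedEnd =
        alignedStart + ((size - alignedStart) / packetSize) * packetSize;
    return {alignedStart, alignedEnd};
}

// Copies src to dst in three phases. The inner k-loop stands in for
// a single aligned packet copy.
inline void copy_linear(float* dst, const float* src, int size) {
    assert(size >= 0);
    const Ranges r = split_ranges(dst, size);
    for (int i = 0; i < r.alignedStart; ++i) dst[i] = src[i];
    for (int i = r.alignedStart; i < r.alignedEnd; i += 4)
        for (int k = 0; k < 4; ++k) dst[i + k] = src[i + k];
    for (int i = r.alignedEnd; i < size; ++i) dst[i] = src[i];
}
```

With an aligned destination and size 50 the ranges are 0 and 48; if the destination starts one float past an aligned address, three coefficients are copied first and the packet range becomes 3 to 47.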
Aliasing Problem
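● Occurs when the same matrix appears on both sides of an assignment and the
right-hand side is evaluated lazily.
mat = mat.transpose();

● Produces wrong results, because coefficients are overwritten before they are read.

● The solution is to go through a temporary variable

tmp = mat.transpose();
mat = tmp;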
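The failure is easy to reproduce without Eigen. The sketch below uses a row-major `std::vector<int>` as a square matrix; `Mat`, `transpose_into`, and `transpose_via_temporary` are hypothetical names for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Mat = std::vector<int>;  // n x n matrix, row-major

// Writes transpose(src) into dst coefficient by coefficient, with no
// temporary. Correct only when dst and src are different objects.
void transpose_into(Mat& dst, const Mat& src, std::size_t n) {
    assert(dst.size() == n * n && src.size() == n * n);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            dst[i * n + j] = src[j * n + i];
}

// Safe variant: evaluate into a temporary, then assign.
void transpose_via_temporary(Mat& mat, std::size_t n) {
    Mat tmp(n * n);
    transpose_into(tmp, mat, n);
    mat = tmp;
}
```

Calling `transpose_into(a, a, 2)` on the matrix {1, 2, 3, 4} yields {1, 3, 3, 4} instead of {1, 3, 2, 4}, because the coefficient 2 is overwritten before it is read; `transpose_via_temporary` gives the correct result. In Eigen itself, `mat.transposeInPlace()` and `mat = mat.transpose().eval()` serve the same purpose as the explicit temporary.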
Platforms
● Supported compilers:

– GCC (from 3.4 to 4.6), MSVC (2005, 2008, 2010), Intel ICC, Clang/LLVM

● Supported systems:

– x86/x86_64 (Linux, Windows)

– ARM (Linux), PowerPC

● Supported SIMD vectorization engines:

– SSE2, SSE3, SSSE3, SSE4

– NEON (ARM)

– Altivec (PowerPC)
Eigen vs BLAS/LAPACK
● Supports fixed-size matrices and vectors
● Supports sparse matrices and vectors
● Offers more features, such as the Geometry and Array modules
● Most operations are faster than, or comparable with, MKL and GotoBLAS
● Better API
● Complex operations are faster
Benchmark
(Two slides of benchmark plots; the figures are not preserved in this text.)
Conclusion
● The benchmarks show that Eigen is comparable with most available linear algebra
libraries.

● Its simple interface makes it more attractive

● Low memory overhead

● Having all features and modules in a single library makes it more usable.
