In x86 terminology, a byte, a word, and a doubleword are 8, 16, and 32 bits wide, respectively.
* [[http://www.tektalk.org/2010/03/11/%e5%8f%8c%e8%8a%af%e8%ae%b0a-tale-of-two-chips/|双芯记(A Tale of Two Chips)]]
* [[wp>Intel iAPX 432]]。[(http://people.cs.nctu.edu.tw/~chenwj/log/LLVM/majkl-2011-12-08.txt)]
* [[http://www.tektalk.org/2010/01/02/%e9%93%81%e8%87%82%e9%98%bf%e7%ab%a5%e6%9c%a8%e2%80%94%e2%80%94intel-atom%e5%a4%84%e7%90%86%e5%99%a8%e5%89%96%e6%9e%90%e4%b8%8e%e7%a0%94%e7%a9%b6/|铁臂阿童木——Intel ATOM处理器剖析与研究]]
* [[http://www.intel.com/content/dam/doc/manual/64-ia-32-architectures-software-developer-vol-1-manual.pdf|Intel® 64 and IA-32 Architectures Software Developer's Manual Volume 1: Basic Architecture]]
* [[http://bbs.chinaunix.net/thread-2012474-1-1.html|【x64 指令系统】之 指令编码内幕]]
* [[http://www.mouseos.com/x64/index.html|x86/x64 指令编码内幕(适用于 AMD/Intel)]]
* [[http://www.intel.com/content/dam/doc/manual/64-ia-32-architectures-software-developer-vol-2a-manual.pdf|Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 2A: Instruction Set Reference, A-M]]
* [[http://blog.csdn.net/misterliwei/article/details/5550452|保护模式、实地址模式及V8086模式下的指令格式(上)]]
* [[http://blog.csdn.net/misterliwei/article/details/3951103|PAUSE指令]]
* [[wp>x86 assembly language]]
* Non-temporal instructions
* [[http://stackoverflow.com/questions/37070/what-is-the-meaning-of-non-temporal-memory-accesses-in-x86|What is the meaning of "non temporal" memory accesses in x86]]
* [[http://www.rz.uni-karlsruhe.de/rz/docs/VTune/reference/vc198.htm]]
* [[http://groups.google.com/group/comp.lang.asm.x86/browse_thread/thread/2ae6c66f8e69ae82/1950f09a79cd4056]]
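x86_64 uses only 48 address bits rather than the full 64: with 4-level paging, virtual addresses are 48 bits wide and must be in canonical form (bits 63 through 47 all equal). [[wp>x86-64]]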
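The 48-bit limit can be illustrated with a canonical-address check. This is a minimal sketch, assuming the 48 bits refer to the virtual address width (as in the original x86-64 implementations); the names `VA_BITS`, `is_canonical`, and `sign_extend` are made up for illustration.

```python
VA_BITS = 48  # assumption: 48-bit virtual addresses (4-level paging)


def is_canonical(addr: int, va_bits: int = VA_BITS) -> bool:
    """Return True if bits 63..(va_bits - 1) of a 64-bit address are all equal."""
    upper = addr >> (va_bits - 1)               # bits 63..47 when va_bits == 48
    all_ones = (1 << (64 - va_bits + 1)) - 1    # 17 one-bits when va_bits == 48
    return upper == 0 or upper == all_ones


def sign_extend(addr: int, va_bits: int = VA_BITS) -> int:
    """Sign-extend the low va_bits of addr into a canonical 64-bit value."""
    low = addr & ((1 << va_bits) - 1)
    if low >> (va_bits - 1):                    # top implemented bit is set
        low |= ((1 << 64) - 1) & ~((1 << va_bits) - 1)
    return low


if __name__ == "__main__":
    for a in (0x00007FFFFFFFFFFF, 0x0000800000000000, 0xFFFF800000000000):
        print(hex(a), is_canonical(a))
```

Addresses between the two canonical halves are rejected, which is why the usable virtual address space appears as a low half and a high half.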
* [[http://blog.csdn.net/muxiqingyang/article/details/6791218|《大话处理器》连载——微架构(22) Superscalar处理器实例——Intel P4 CPU]]
* [[http://www.ithome.com.tw/itadm/article.php?c=38381&s=1|深度剖析英特爾旗艦處理器-走出隧道盡頭的Itanium]]
* [[http://www.diybl.com/course/3_program/hb/hbjs/2007124/89933.html|从X86指令RET和CALL的意义看进程的自由切换]]
* [[http://people.cs.nctu.edu.tw/~huangmc/works/web/Boot_x86/Boot_x86.html|X86 開機流程小記]]
* [[http://henbin.blogspot.com/2009/02/x86-reset-concept.html|X86 Reset concept]]
MMIO is accessed through ordinary memory addresses, so MMIO accesses go through address translation.
* [[http://www.embexperts.com/viewthread.php?tid=65|X86 IO端口和MMIO]]
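The A20 address line gate exists mainly for backward compatibility: when A20 is masked, accesses to addresses at or above 1 MB wrap around and start again from 0.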
* [[wp>DOS memory management]]
* [[http://software.intel.com/en-us/blogs/2012/02/07/transactional-synchronization-in-haswell/|Transactional Synchronization in Haswell]]
* [[http://www.pagetable.com/?p=364|Why is there no CR1 – and why are control registers such a mess anyway?]]
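The A20 wraparound noted above can be sketched as follows. This is a minimal model under the usual real-mode rule (physical address = segment × 16 + offset); the function name `real_mode_address` and its parameters are hypothetical, chosen only for this illustration.

```python
def real_mode_address(segment: int, offset: int, a20_enabled: bool) -> int:
    """Compute the address a real-mode segment:offset pair resolves to.

    With A20 masked, address bit 20 is forced to 0, so addresses in the
    range 0x100000..0x10FFEF wrap around to 0x00000..0x0FFEF.
    """
    linear = ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)  # at most 0x10FFEF
    if not a20_enabled:
        linear &= ~(1 << 20)  # clear address bit 20
    return linear


if __name__ == "__main__":
    # FFFF:0010 is the first byte above 1 MB.
    print(hex(real_mode_address(0xFFFF, 0x0010, a20_enabled=True)))
    print(hex(real_mode_address(0xFFFF, 0x0010, a20_enabled=False)))
```

Software written for the 8086 relied on the wrapped behavior, which is why the gate defaults to masked for compatibility.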
====== CPU ======
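Real mode: no descriptor-based segmentation and no paging. The address is formed directly as segment × 16 + offset, with no protection checks.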
Segmentation provides a mechanism of isolating individual code, data, and stack modules so that multiple programs (or tasks) can run on the same processor without interfering with one another. Paging provides a mechanism for implementing a conventional demand-paged, virtual-memory system where sections of a program's execution environment are mapped into physical memory as needed. Paging can also be used to provide isolation between multiple tasks.

A logical address has two parts: a 16-bit segment selector and a 32-bit offset. It is translated into a linear address in two steps:
- Use the segment selector to index the descriptor table and fetch the segment descriptor. The selector contains an index, TI (selects the GDT or the LDT), and RPL. Access permission is checked by taking the numerically larger (less privileged) of CPL and RPL; for a data segment, access is allowed only if that value does not exceed the descriptor's DPL.
- Add the segment base address from the descriptor to the offset to obtain the linear address.

Segment registers: CS, DS, ES, FS, GS, SS.
* [[http://www.csie.ntu.edu.tw/~wcchen/asm98/asm/proj/b85506061/chap2/segment.html|分段架構]]
* [[http://www.mouseos.com/arch/segmentation.html|segmentation 情景分析(上篇)--- 数据结构]]
* [[http://www.mouseos.com/arch/segmentation_protected.html|segmentation 情景分析(下篇)--- protected 机制]]
===== Paging =====
* [[http://www.mouseos.com/arch/paging.html|理解 paging]]
* [[http://www.mouseos.com/arch/page-protected.html|page 的保护措施]]
* [[http://www.ibm.com/developerworks/cn/linux/l-lvm64/index.html|X86-64 上的 Linux VM 管理系统]]

A page-table entry has the following control bits:
* G: Global. The translation is kept in the TLB across CR3 reloads (when CR4.PGE = 1).
* D: Dirty. Set by the processor when the page is written.
* A: Accessed. Set by the processor when the page is accessed.
* C: Cache disable (PCD).
* W: Write-through (PWT).
* U/S: User or supervisor. User-mode code cannot access supervisor pages.
* R/W: Whether the page is writable.
* Present (P): The page is present in physical memory.

* [[http://stackoverflow.com/questions/10671147/how-do-x86-page-tables-work|How do x86 page tables work?]]
* [[http://lwn.net/Articles/106177/|Four-level page tables]]
* [[http://wiki.osdev.org/Paging|Paging]]
* [[http://download.intel.com/products/processor/manual/253668.pdf|Intel® 64 and IA-32 Architectures Software Developer's Manual Volume 3A: System Programming Guide, Part 1]], Chapter 4

When operating in protected mode, some form of segmentation must be used. There is no mode bit to disable segmentation. The use of paging, however, is optional.
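The selector fields and page-table control bits described above can be decoded with a few shifts and masks. This is a minimal sketch, not a full model of the protection checks: the helper names are hypothetical, the bit positions are those of 32-bit protected mode, the privilege rule shown is the one for data-segment access only (an assumption about what the truncated note intended), and `PWT`/`PCD` correspond to the W and C bits in the list.

```python
from typing import Dict, NamedTuple


class Selector(NamedTuple):
    index: int  # bits 15..3: index into the descriptor table
    ti: int     # bit 2: 0 = GDT, 1 = LDT
    rpl: int    # bits 1..0: requested privilege level


def decode_selector(sel: int) -> Selector:
    """Split a 16-bit segment selector into index, TI, and RPL."""
    return Selector(index=(sel >> 3) & 0x1FFF, ti=(sel >> 2) & 1, rpl=sel & 3)


def data_access_allowed(cpl: int, rpl: int, dpl: int) -> bool:
    """Data-segment check: the less privileged of CPL and RPL must not exceed DPL."""
    return max(cpl, rpl) <= dpl


def logical_to_linear(segment_base: int, offset: int) -> int:
    """Linear address = segment base + offset, truncated to 32 bits."""
    return (segment_base + offset) & 0xFFFFFFFF


# Bit positions of the control flags in a 32-bit page-table entry.
PTE_FLAGS = {"P": 0, "R/W": 1, "U/S": 2, "PWT": 3, "PCD": 4, "A": 5, "D": 6, "G": 8}


def decode_pte_flags(entry: int) -> Dict[str, bool]:
    """Return each control flag of a page-table entry as a boolean."""
    return {name: bool((entry >> bit) & 1) for name, bit in PTE_FLAGS.items()}


if __name__ == "__main__":
    print(decode_selector(0x23))
    print(decode_pte_flags(0x067))
```

Keeping the flag positions in one table makes it easy to extend the sketch, for example with the NX bit of PAE and IA-32e entries.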
Software enables paging by using the MOV to CR0 instruction to set CR0.PG. Before doing so, software should ensure that control register CR3 contains the physical address of the first paging structure that the processor will use for linear-address translation (see Section 4.2) and that structure is initialized as desired.

Paging modes:
* 32-bit paging: CR0.PG = 1 and CR4.PAE = 0
* PAE paging: CR0.PG = 1, CR4.PAE = 1, and IA32_EFER.LME = 0
* IA-32e paging: CR0.PG = 1, CR4.PAE = 1, and IA32_EFER.LME = 1

Related control bits:
* CR0.WP
* CR4.PSE
* CR4.PGE
* CR4.PCIDE
* CR4.SMEP
* IA32_EFER.NXE
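The three conditions on CR0.PG, CR4.PAE, and IA32_EFER.LME amount to a small decision table. The sketch below encodes exactly those conditions; the function name `paging_mode` and the returned strings are illustrative choices, and the other control bits (CR0.WP, CR4.PSE, and so on) are deliberately left out.

```python
def paging_mode(cr0_pg: bool, cr4_pae: bool, efer_lme: bool) -> str:
    """Return the paging mode selected by CR0.PG, CR4.PAE and IA32_EFER.LME."""
    if not cr0_pg:
        return "none"      # paging disabled: linear address == physical address
    if not cr4_pae:
        return "32-bit"    # CR0.PG = 1, CR4.PAE = 0
    if not efer_lme:
        return "PAE"       # CR0.PG = 1, CR4.PAE = 1, IA32_EFER.LME = 0
    return "IA-32e"        # CR0.PG = 1, CR4.PAE = 1, IA32_EFER.LME = 1


if __name__ == "__main__":
    for bits in ((False, False, False), (True, False, False),
                 (True, True, False), (True, True, True)):
        print(bits, "->", paging_mode(*bits))
```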
The first paging structure used for any translation is located at the physical address in CR3.
* [[http://electronics.stackexchange.com/questions/21469/are-page-table-walks-cached|Are page table walks cached?]]
* [[http://www.cs.rice.edu/CS/Architecture/docs/barr-isca10.pdf|Translation Caching: Skip, Don't Walk (the Page Table)]]
===== Terminology =====
* Page Global Directory (PGD)
* Page Middle Directory (PMD)
* Page Table Entry (PTE)
* Page-Map Level 4 (PML4)
* Page-directory-pointer table (PDPT)
* [[wp>Physical Address Extension|Physical Address Extension (PAE)]]
* [[wp>Page attribute table|Page Attribute Table (PAT)]]
* [[wp>Memory type range register|Memory Type Range Register (MTRR)]]
====== Terminology ======
* Current Privilege Level (CPL)
* Descriptor Privilege Level (DPL)
* Requested Privilege Level (RPL)
* Global Descriptor Table (GDT)
====== MMX & SSE ======
MMX -> SSE -> AVX
* [[http://www.mobile01.com/topicdetail.php?f=296&t=367606&last=3254788|1-3.CPU進階技術講解,XD、VT、SSE在幹嘛]]
* [[http://en.wikibooks.org/wiki/X86_Assembly/SSE|X86 Assembly/SSE]]
* [[http://neilkemp.us/src/sse_tutorial/sse_tutorial.html|Intel SSE Tutorial : An Introduction to the SSE Instruction Set]]
* [[https://developer.apple.com/hardwaredrivers/ve/sse.html]]
* [[http://saluc.engr.uconn.edu/refs/processors/intel/sse_sse2.pdf|Using SSE and SSE2: Misconcepts and Reality]]
* [[http://www.formboss.net/blog/2010/10/sse-intrinsics-tutorial/]]
* [[http://msdn.microsoft.com/en-us/library/y0dh78ez(v=VS.80).aspx]]
* 2.3
  * Sixteen 128-bit registers (XMM0 - XMM15) in 64-bit mode
* [[wp>Streaming SIMD Extensions]]
  * PD: Packed Double
  * PS: Packed Single
  * SD: Scalar Double
  * SS: Scalar Single
* [[wp>x86 instruction listings]]
* 3.1.1.3 Instruction Column in the Opcode Summary Table
  * r/m32: A doubleword general-purpose register or memory operand used for instructions whose operand-size attribute is 32 bits. The doubleword general-purpose registers are: EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI. The contents of memory are found at the address provided by the effective address computation. Doubleword registers R8D - R15D are available when using REX.R in 64-bit mode.
  * mm: An MMX register. The 64-bit MMX registers are: MM0 through MM7.
* [[http://stackoverflow.com/questions/7115795/sse-instructions-in-a-buffer|SSE instructions in a buffer]]
$ gcc -O2 -ftree-vectorize -msse2 -ftree-vectorizer-verbose=5
* [[http://stackoverflow.com/questions/409300/how-to-vectorize-with-gcc|How to vectorize with gcc?]]
* [[http://gcc.gnu.org/onlinedocs/gcc-4.8.0/gcc/Optimize-Options.html#Optimize-Options|3.10 Options That Control Optimization]]
* `-ftree-vectorize`
* Perform loop vectorization on trees. This flag is enabled by default at -O3.
* [[http://gcc.gnu.org/onlinedocs/gcc-4.8.0/gcc/i386-and-x86_002d64-Options.html#i386-and-x86_002d64-Options|3.17.16 Intel 386 and AMD x86-64 Options]]
* `-msse2`
* These switches enable or disable the use of instructions in the MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, F16C, FMA, SSE4A, FMA4, XOP, LWP, ABM, BMI, BMI2, LZCNT, RTM or 3DNow! extended instruction sets.
* Using GCC Auto-Vectorizer
$ gcc -mfpu=neon -mfloat-abi=softfp or -mfloat-abi=hard
* [[http://locklessinc.com/articles/vectorize/|Auto-vectorization with gcc 4.7]]
* [[http://stackoverflow.com/questions/11129159/auto-vectorization-in-llvm|Auto vectorization in llvm]]
====== Miscellaneous ======
* [[wp>Time Stamp Counter]]
* [[http://download.intel.com/embedded/software/IA/324264.pdf|How to Benchmark Code Execution Times on Intel® IA-32 and IA-64 Instruction Set Architectures]]
* [[http://www.ccsl.carleton.ca/~jamuir/rdtscpm1.pdf|Using the RDTSC Instruction for Performance Monitoring]]
* [[http://blog.csdn.net/solstice/article/details/5196544|多核时代不宜再用 x86 的 RDTSC 指令测试指令周期和时间]]
* [[http://blog.gslin.org/archives/2012/03/09/2847/amd-cpu-bug-%e5%95%8f%e9%a1%8c/|AMD CPU bug 問題…]]
* [[http://leaf.dragonflybsd.org/mailarchive/kernel/2011-12/msg00025.html|Buildworld loop seg-fault update -- I believe it is hardware]]
* [[http://leaf.dragonflybsd.org/mailarchive/kernel/2012-03/msg00006.html|AMD cpu bug update -- AMD confirms! (additional info)]]
* [[http://newsletter.sigmicro.org/sigmicro-oral-history-transcripts/Bob-Colwell-Transcript.pdf|Oral history of Robert P. Colwell (1954- )]]
====== External Links ======
* [[http://www.mouseos.com/index.html|mouseOS 技术小站]]
* [[http://www.csie.ntu.edu.tw/~wcchen/asm98/asm/proj/b85506061/cover.html|Intel Architecture 保護模式架構]]
* [[http://www.cs.cmu.edu/~fp/courses/15213-s07/misc/asm64-handout.pdf|x86-64 Machine-Level Programming]]
* [[http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html|Intel® 64 and IA-32 Architectures Software Developer Manuals]]
* [[http://developer.amd.com/pages/default.aspx|AMD Developer Central]]