11859 need swapgs mitigation
Reviewed by: Robert Mustacchi <rm@fingolfin.org>
Reviewed by: Dan McDonald <danmcd@joyent.com>
Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com>

          --- old/usr/src/uts/i86pc/os/cpuid.c
          +++ new/usr/src/uts/i86pc/os/cpuid.c
[902 lines elided]
 903  903   * -----------------------------------------------
 904  904   *
 905  905   * With the advent of the Spectre and Meltdown attacks which exploit speculative
 906  906   * execution in the CPU to create side channels there have been a number of
 907  907   * different attacks and corresponding issues that the operating system needs to
 908  908   * mitigate against. The following list is some of the common, but not
 909  909   * exhaustive, set of issues that we know about and have done some or need to do
 910  910   * more work in the system to mitigate against:
 911  911   *
 912  912   *   - Spectre v1
      913 + *   - swapgs (Spectre v1 variant)
 913  914   *   - Spectre v2
 914  915   *   - Meltdown (Spectre v3)
 915  916   *   - Rogue Register Read (Spectre v3a)
 916  917   *   - Speculative Store Bypass (Spectre v4)
 917  918   *   - ret2spec, SpectreRSB
 918  919   *   - L1 Terminal Fault (L1TF)
 919  920   *   - Microarchitectural Data Sampling (MDS)
 920  921   *
 921  922   * Each of these requires different sets of mitigations and has different attack
 922  923   * surfaces. For the most part, this discussion is about protecting the kernel
 923  924   * from non-kernel executing environments such as user processes and hardware
 924  925   * virtual machines. Unfortunately, there are a number of user vs. user
 925  926   * scenarios that exist with these. The rest of this section will describe the
 926  927   * overall approach that the system has taken to address these as well as their
 927  928   * shortcomings. Unfortunately, not all of the above have been handled today.
 928  929   *
 929      - * SPECTRE FAMILY (Spectre v2, ret2spec, SpectreRSB)
      930 + * SPECTRE v2, ret2spec, SpectreRSB
 930  931   *
 931  932   * The second variant of the spectre attack focuses on performing branch target
 932  933   * injection. This generally impacts indirect call instructions in the system.
 933  934   * There are three different ways to mitigate this issue that are commonly
 934  935   * described today:
 935  936   *
 936  937   *  1. Using Indirect Branch Restricted Speculation (IBRS).
 937  938   *  2. Using Retpolines and RSB Stuffing
 938  939   *  3. Using Enhanced Indirect Branch Restricted Speculation (EIBRS)
 939  940   *
[88 lines elided]
1028 1029   * that a late loaded microcode may not end up in the optimal configuration
1029 1030   * (though this should be rare).
1030 1031   *
1031 1032   * Currently we do not build kmdb with retpolines or perform any additional side
1032 1033   * channel security mitigations for it. One complication with kmdb is that it
1033 1034   * requires its own retpoline thunks and it would need to adjust itself based on
1034 1035   * what the kernel does. The threat model of kmdb is more limited and therefore
1035 1036   * it may make more sense to investigate using prediction barriers as the whole
1036 1037   * system is only executing a single instruction at a time while in kmdb.
1037 1038   *
1038      - * SPECTRE FAMILY (v1, v4)
     1039 + * SPECTRE v1, v4
1039 1040   *
1040 1041   * The v1 and v4 variants of spectre are not currently mitigated in the
1041 1042   * system and require other classes of changes to occur in the code.
1042 1043   *
     1044 + * SPECTRE v1 (SWAPGS VARIANT)
     1045 + *
     1046 + * The class of Spectre v1 vulnerabilities aren't all about bounds checks, but
     1047 + * can generally affect any branch-dependent code. The swapgs issue is one
     1048 + * variant of this. If we are coming in from userspace, we can have code like
     1049 + * this:
     1050 + *
     1051 + *      cmpw    $KCS_SEL, REGOFF_CS(%rsp)
     1052 + *      je      1f
     1053 + *      movq    $0, REGOFF_SAVFP(%rsp)
     1054 + *      swapgs
     1055 + *      1:
     1056 + *      movq    %gs:CPU_THREAD, %rax
     1057 + *
     1058 + * If an attacker can cause a mis-speculation of the branch here, we could skip
     1059 + * the needed swapgs, and use the /user/ %gsbase as the base of the %gs-based
     1060 + * load. If subsequent code can act as the usual Spectre cache gadget, this
     1061 + * would potentially allow KPTI bypass. To fix this, we need an lfence prior to
     1062 + * any use of the %gs override.
     1063 + *
     1064 + * The other case is also an issue: if we're coming into a trap from kernel
     1065 + * space, we could mis-speculate and swapgs the user %gsbase back in prior to
     1066 + * using it. AMD systems are not vulnerable to this version, as a swapgs is
     1067 + * serializing with respect to subsequent uses. But as AMD /does/ need the other
     1068 + * case, and the fix is the same in both cases (an lfence at the branch target
     1069 + * 1: in this example), we'll just do it unconditionally.
     1070 + *
     1071 + * Note that we don't enable user-space "wrgsbase" via CR4_FSGSBASE, making it
     1072 + * harder for user-space to actually set a useful %gsbase value: although it's
     1073 + * not clear, it might still be feasible via lwp_setprivate(), though, so we
     1074 + * mitigate anyway.
     1075 + *
1043 1076   * MELTDOWN
1044 1077   *
1045 1078   * Meltdown, or spectre v3, allowed a user process to read any data in their
1046 1079   * address space regardless of whether or not the page tables in question
1047 1080   * allowed the user to have the ability to read them. The solution to meltdown
1048 1081   * is kernel page table isolation. In this world, there are two page tables that
1049 1082   * are used for a process, one in user land and one in the kernel. To implement
1050 1083   * this we use per-CPU page tables and switch between the user and kernel
1051 1084   * variants when entering and exiting the kernel.  For more information about
1052 1085   * this process and how the trampolines work, please see the big theory
[99 lines elided]
1152 1185   * would have to issue an inter-processor interrupt (IPI) to the other thread.
1153 1186   * Rather than implement this, we recommend that one disables hyper-threading
1154 1187   * through the use of psradm -aS.
1155 1188   *
1156 1189   * SUMMARY
1157 1190   *
1158 1191   * The following table attempts to summarize the mitigations for various issues
1159 1192   * and what's done in various places:
1160 1193   *
1161 1194   *  - Spectre v1: Not currently mitigated
     1195 + *  - swapgs: lfences after swapgs paths
1162 1196   *  - Spectre v2: Retpolines/RSB Stuffing or EIBRS if HW support
1163 1197   *  - Meltdown: Kernel Page Table Isolation
1164 1198   *  - Spectre v3a: Updated CPU microcode
1165 1199   *  - Spectre v4: Not currently mitigated
1166 1200   *  - SpectreRSB: SMEP and RSB Stuffing
1167      - *  - L1TF: spec_uarch_flush, smt exclusion, requires microcode
     1201 + *  - L1TF: spec_uarch_flush, SMT exclusion, requires microcode
1168 1202   *  - MDS: x86_md_clear, requires microcode, disabling hyper threading
1169 1203   *
1170 1204   * The following table indicates the x86 feature set bits that indicate that a
1171 1205   * given problem has been solved or a notable feature is present:
1172 1206   *
1173 1207   *  - RDCL_NO: Meltdown, L1TF, MSBDS subset of MDS
1174 1208   *  - MDS_NO: All forms of MDS
1175 1209   */
1176 1210  
1177 1211  #include <sys/types.h>
[6145 lines elided]