
MD5 transform routine optimized for x64 processors written using Macro Assembler. The performance is 4.94 CPU cycles per byte (on a typical Intel CPU like Skylake).


maximmasiutin/MD5_Transform-x64

MD5_Transform-x64

MD5 transform routine optimized for x86-64 processors written using Macro Assembler

Copyright (C) 2018-2020 Ritlabs, SRL. All rights reserved.

Copyright (C) 2020-2021 Maxim Masiutin. All rights reserved.

The 64-bit version is written by Maxim Masiutin [email protected]

Based on 32-bit code by Peter Sawatzki (see below).

The performance is 4.94 CPU cycles per byte (on Skylake).

The main advantage of this 64-bit version is that it loads the 64 bytes of the message block being hashed into eight 64-bit registers (RBP, R8, R9, R10, R11, R12, R13, R14) at the beginning, which avoids repeated memory loads throughout the routine.
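As a rough illustration of the register layout described above, the following C sketch packs one 64-byte block into eight 64-bit values, so that each value carries two consecutive 32-bit MD5 message words. The names `load_le64`, `load_block`, and `message_word` are hypothetical and do not come from the assembly source; the array `regs` merely stands in for RBP and R8-R14.

```c
#include <stdint.h>

/* Hypothetical helper: read 8 bytes as a little-endian 64-bit value,
   independent of the host byte order. */
static uint64_t load_le64(const unsigned char *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Load the whole 64-byte block once; regs[0..7] model the eight
   registers that hold the message for the rest of the transform. */
static void load_block(const unsigned char block[64], uint64_t regs[8])
{
    for (int i = 0; i < 8; i++)
        regs[i] = load_le64(block + 8 * i);
}

/* Message word i (0..15): even words sit in bits 0-31 of a register,
   odd words in bits 32-63. */
static uint32_t message_word(const uint64_t regs[8], unsigned i)
{
    uint64_t r = regs[i / 2];
    return (uint32_t)((i & 1u) ? (r >> 32) : r);
}
```

After `load_block` has run, every later access to a message word is a register operation rather than a memory read, which is the saving the paragraph above refers to.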

To operate on a 32-bit value stored in the upper half of a 64-bit register (bits 32-63), the code rotates the register by 32 bits with "Ror"; 8 macro variables (M1-M8) keep track of whether each register is currently rotated or not.
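The rotate-and-track idea can be sketched in C as follows. This is only a model under stated assumptions: the `PackedReg` type and the `ror64` and `get_word` functions are hypothetical names, and the `swapped` flag is kept at run time here, whereas macro variables such as M1-M8 are presumably resolved when the code is assembled.

```c
#include <stdint.h>

/* Hypothetical model of one message register plus its state flag. */
typedef struct {
    uint64_t value;   /* two packed 32-bit message words */
    int      swapped; /* 1 if the register is currently rotated by 32 */
} PackedReg;

/* Rotate right; n must be in 1..63. */
static uint64_t ror64(uint64_t x, unsigned n)
{
    return (x >> n) | (x << (64 - n));
}

/* Return the requested half (want_high = 0 or 1) from the low 32 bits,
   rotating only when the register is not already in the right state. */
static uint32_t get_word(PackedReg *r, int want_high)
{
    if (r->swapped != want_high) {
        r->value = ror64(r->value, 32);
        r->swapped = want_high;
    }
    return (uint32_t)r->value;
}
```

Because a rotation by 32 is its own inverse, the flag is all that is needed to know which word currently sits in the low half, and no rotation is spent on restoring the register between uses.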

The code can also use a single LEA instruction instead of two sequential ADDs (uncomment UseLea=1), but this is slower on Skylake processors. In addition, the Intel Optimization Reference Manual discourages using LEA as a replacement for two ADDs, since it is slower on Atom processors.
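To show what the two encodings compute, here is a minimal C sketch of one round-1 MD5 step written both ways. The function names are hypothetical, the instruction sequences in the comments are only an approximation of what such assembly might look like, and a C compiler is free to pick either form, so the sketch illustrates the arithmetic rather than the timing.

```c
#include <stdint.h>

/* Rotate left; s must be in 1..31. */
static uint32_t rotl32(uint32_t v, unsigned s)
{
    return (v << s) | (v >> (32 - s));
}

/* Round-1 auxiliary function F from the MD5 specification. */
static uint32_t md5_f(uint32_t b, uint32_t c, uint32_t d)
{
    return (b & c) | (~b & d);
}

/* Variant modelled on two sequential ADDs, roughly:
   add a, t  /  add a, x */
static uint32_t step_two_adds(uint32_t a, uint32_t b, uint32_t c, uint32_t d,
                              uint32_t x, uint32_t t, unsigned s)
{
    a += t;
    a += x;
    a += md5_f(b, c, d);
    return rotl32(a, s) + b;
}

/* Variant modelled on one three-operand LEA, roughly:
   lea a, [a + x + t] */
static uint32_t step_lea(uint32_t a, uint32_t b, uint32_t c, uint32_t d,
                         uint32_t x, uint32_t t, unsigned s)
{
    a = a + x + t;
    a += md5_f(b, c, d);
    return rotl32(a, s) + b;
}
```

Both variants give the same result modulo 2^32; the only difference in the assembly is how many instructions are used, which is why the choice comes down to measured performance on a given processor.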

This code is used in "The Bat!" email client https://www.ritlabs.com/en/products/thebat/

MD5_Transform-x64 is released under a dual license, and you may choose to use it under either the Mozilla Public License 2.0 (MPL 2.0, available from https://www.mozilla.org/en-US/MPL/2.0/) or the GNU Lesser General Public License Version 3, dated 29 June 2007 (LGPL 3, available from https://www.gnu.org/licenses/lgpl.html).

MD5_Transform-x64 is based on the following code by Peter Sawatzki.

The original notice by Peter Sawatzki follows.

MD5_386.Asm

386 optimized helper routine for calculating MD Message-Digest values

written 2/2/94 by

Peter Sawatzki Buchenhof 3 D58091 Hagen, Germany Fed Rep

EMail: [email protected] EMail: [email protected] WWW: http://www.sawatzki.de

original C Source was found in Dr. Dobbs Journal Sep 91 MD5 algorithm from RSA Data Security, Inc.
