{"id":256,"date":"2010-09-21T17:30:46","date_gmt":"2010-09-21T15:30:46","guid":{"rendered":"http:\/\/joernhees.de\/blog\/?p=256"},"modified":"2017-08-21T14:08:10","modified_gmt":"2017-08-21T12:08:10","slug":"how-to-convert-hex-strings-to-binary-ascii-strings-in-python-incl-8bit-space","status":"publish","type":"post","link":"https:\/\/joernhees.de\/blog\/2010\/09\/21\/how-to-convert-hex-strings-to-binary-ascii-strings-in-python-incl-8bit-space\/","title":{"rendered":"How to convert hex strings to binary ascii strings in python (incl. 8bit space)"},"content":{"rendered":"<p>As I come across this again and again:<\/p>\n<p>How do you turn a hex string like <code>\"c3a4c3b6c3bc\"<\/code> into a nice binary string like this: <code>\"11000011 10100100 11000011 10110110 11000011 10111100\"<\/code>?<\/p>\n<p>The solution is based on the string formatting introduced in Python 2.6:<\/p>\n<pre><code class=\"python\">&gt;&gt;&gt; \"{0:8b}\".format(int(\"c3\",16))\n'11000011'\n<\/code><\/pre>\n<p>This can be decomposed into 4 bits per hex char like this (notice the <code>04b<\/code>, which means a zero-padded binary string 4 chars long):<\/p>\n<pre><code class=\"python\">&gt;&gt;&gt; \"{0:04b}\".format(int(\"c\",16)) + \"{0:04b}\".format(int(\"3\",16))\n'11000011'\n<\/code><\/pre>\n<p>OK, now we could easily do this for all hex chars with <code>\"\".join([\"{0:04b}\".format(int(c,16)) for c in \"c3a4c3b6\"])<\/code> and be done, but usually we want a blank every 8 bits, counted from right to left&#8230; And looping pairwise from the right is a bit more complicated&#8230; Oh, and what if the number of hex chars is odd?<br \/>\nSo the solution looks like this:<\/p>\n<pre><code class=\"python\">&gt;&gt;&gt; binary = lambda x: \" \".join(reversed( [i+j for i,j in zip( *[ [\"{0:04b}\".format(int(c,16)) for c in reversed(\"0\"+x)][n::2] for n in [1,0] ] ) ] ))\n&gt;&gt;&gt; binary(\"c3a4c3b6c3bc\")\n'11000011 10100100 11000011 10110110 11000011 10111100'\n<\/code><\/pre>\n<p>It takes the hex string <code>x<\/code>, first 
of all prepends a <code>\"0\"<\/code> (for the odd-length case), then reverses the string, converts every char into a 4-bit binary string, then collects the elements at odd indices of this list, zips them with the elements at even indices, concatenates each pair into an 8-bit binary string, reverses again and joins the results with a <code>\" \"<\/code> in between. With an even number of hex chars the added 0 drops out because it has no partner to zip with; with an odd number it pairs with the first hex char.<\/p>\n<p>Yup, I like one-liners \ud83d\ude09<\/p>\n<p>Update: By the way, it&#8217;s very easy to combine this with <code>binascii.hexlify<\/code> to get the binary representation of a byte string (Python 2 shown here; in Python 3, pass <code>bytes<\/code> to <code>hexlify<\/code> and decode its result to <code>str<\/code> before calling <code>binary<\/code>):<\/p>\n<pre><code class=\"python\">&gt;&gt;&gt; import binascii\n&gt;&gt;&gt; binascii.hexlify('j\u00f6rn')\n'6ac3b6726e'\n&gt;&gt;&gt; binary(binascii.hexlify('j\u00f6rn'))\n'01101010 11000011 10110110 01110010 01101110'\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>As I come across this again and again: How do you turn a hex string like &#8220;c3a4c3b6c3bc&#8221; into a nice binary string like this: &#8220;11000011 10100100 11000011 10110110 11000011 10111100&#8221;? 
The solution is based on the Python 2.6 new string formatting: &gt;&gt;&gt; &#8220;{0:8b}&#8221;.format(int(&#8220;c3&#8221;,16)) &#8216;11000011&#8217; Which can be decomposed into 4 bit for each hex char [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":""},"categories":[2],"tags":[16,17,44,66,132,158,183],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/pYA5n-48","jetpack-related-posts":[{"id":166,"url":"https:\/\/joernhees.de\/blog\/2010\/07\/31\/urlencoding-in-python\/","url_meta":{"origin":256,"position":0},"title":"(URL)Encoding in python","date":"2010-07-31","format":false,"excerpt":"Well, encodings are a never ending story and whenever you don't want to waste time on them, it's for sure that you'll stumble over yet another tripwire. This time it is the encoding of URLs (note: even though related I'm not talking about the urlencode function). Perhaps you have seen\u2026","rel":"","context":"In &quot;Coding&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":314,"url":"https:\/\/joernhees.de\/blog\/2010\/12\/15\/python-unicode-doctest-howto-in-a-doctest\/","url_meta":{"origin":256,"position":1},"title":"Python unicode doctest howto in a doctest","date":"2010-12-15","format":false,"excerpt":"Another thing which has been on my stack for quite a while has been a unicode doctest howto, as I remember I was quite lost when I first tried to test encoding stuff in a doctest. 
So I thought the ultimate way to show how to do this would be\u2026","rel":"","context":"In &quot;Coding&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":297,"url":"https:\/\/joernhees.de\/blog\/2010\/12\/14\/how-to-restrict-the-length-of-a-unicode-string\/","url_meta":{"origin":256,"position":2},"title":"How to restrict the length of a unicode string","date":"2010-12-14","format":false,"excerpt":"Ha, not with me! It's a pretty common tripwire: Imagine you have a unicode string and for whatever reason (which should be a good reason, so make sure you really need this) you need to make sure that its UTF-8 representation has at most maxsize bytes. The first and in\u2026","rel":"","context":"In &quot;Coding&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":91,"url":"https:\/\/joernhees.de\/blog\/2010\/07\/21\/sort-python-dict-by-values\/","url_meta":{"origin":256,"position":3},"title":"Sort python dictionaries by values","date":"2010-07-21","format":false,"excerpt":"Perhaps you already encountered a problem like the following one yourself: You have a large list of items (let's say URIs for this example) and want to sum up how often they were viewed (or edited or... whatever). 
A small one-shot solution in python looks like the following and uses\u2026","rel":"","context":"In &quot;Coding&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":566,"url":"https:\/\/joernhees.de\/blog\/2014\/02\/25\/scientific-python-on-mac-os-x-10-9-with-homebrew\/","url_meta":{"origin":256,"position":4},"title":"Scientific Python on Mac OS X 10.9+ with homebrew","date":"2014-02-25","format":false,"excerpt":"Scientific python setup guide for Mac OS X 10.9 Mavericks with homebrew","rel":"","context":"In &quot;Coding&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":19,"url":"https:\/\/joernhees.de\/blog\/2010\/06\/28\/python-and-encoding\/","url_meta":{"origin":256,"position":5},"title":"Python and encoding","date":"2010-06-28","format":false,"excerpt":"Well, first real post, so let's start easy. I've been working a lot with python lately, and came across a nice short How to Use UTF-8 with Python which also makes the difference between unicode and utf8 very clear. The howto also links to another valuable source: Characters vs. 
Bytes,\u2026","rel":"","context":"In &quot;Coding&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"https:\/\/joernhees.de\/blog\/wp-json\/wp\/v2\/posts\/256"}],"collection":[{"href":"https:\/\/joernhees.de\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/joernhees.de\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/joernhees.de\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/joernhees.de\/blog\/wp-json\/wp\/v2\/comments?post=256"}],"version-history":[{"count":4,"href":"https:\/\/joernhees.de\/blog\/wp-json\/wp\/v2\/posts\/256\/revisions"}],"predecessor-version":[{"id":840,"href":"https:\/\/joernhees.de\/blog\/wp-json\/wp\/v2\/posts\/256\/revisions\/840"}],"wp:attachment":[{"href":"https:\/\/joernhees.de\/blog\/wp-json\/wp\/v2\/media?parent=256"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/joernhees.de\/blog\/wp-json\/wp\/v2\/categories?post=256"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/joernhees.de\/blog\/wp-json\/wp\/v2\/tags?post=256"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}