angr (https://github.com/angr/angr) uses a concolic execution engine: it can run a binary concretely, break, then treat an input as unknown and solve for what that input should be to trigger a different breakpoint. E.g. what the “password” pointer should point to in order to trigger the “you’re in” branch of the code.
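A rough sketch of that “find the input that reaches this branch” idea using angr's Python API. The binary path and both addresses are hypothetical placeholders, and I'm assuming the program reads the password from stdin. This version explores symbolically from the entry point, which is simpler than the break-then-switch workflow described above, so treat it as an illustration rather than the exact setup.

```python
def find_input(binary_path, find_addr, avoid_addr):
    """Search for stdin bytes that drive the binary to find_addr.

    binary_path, find_addr and avoid_addr are hypothetical: you would take
    the addresses from a disassembler (e.g. the "you're in" block and the
    "wrong password" block).
    """
    import angr  # third-party: pip install angr

    # Load only the main binary; skipping shared libraries keeps the search small.
    proj = angr.Project(binary_path, auto_load_libs=False)

    # Start at the program entry point; stdin is symbolic (unknown) by default.
    state = proj.factory.entry_state()
    simgr = proj.factory.simulation_manager(state)

    # Explore paths until one reaches find_addr, dropping paths that hit avoid_addr.
    simgr.explore(find=find_addr, avoid=avoid_addr)

    if not simgr.found:
        return None  # no path to find_addr was discovered

    # Ask the solver for concrete stdin bytes that satisfy the path constraints.
    return simgr.found[0].posix.dumps(0)
```

Usage would look something like `find_input("./crackme", 0x401234, 0x401250)`, where the file name and addresses are made up for the example.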

Note: it still can’t reverse hashes. If you try to reverse md5 using this approach it’ll consume petabytes of RAM.

I think radare2 was looking into integrating with angr but I don’t know the status of the integration.

  • PenguinCoder@beehaw.org
    1 year ago

    I’m an incident responder/malware analyst. Mostly do static analysis and reverse engineering. What would you say the benefit of your research and this binary analysis is compared to other offerings? What do you do about highly obfuscated or ‘benign’ looking binaries that aren’t?

  • strudel6242@beehaw.org
    1 year ago

    What was your journey like getting into this as a career? What have been some of the toughest challenges you’ve faced as a researcher? Why did you specialise in automated binary analysis?

  • RoaringSilence@kbin.social
    1 year ago

    Thanks for doing this ama!

    Without revealing too much, who are your customers, or is it purely research based?

    A second question: is code often vulnerable because it is written in certain programming languages that have “known” problems, or do the problems mostly come from bad coding habits?

    • Hexorg@beehaw.org (OP)
      1 year ago

      I was associated with this so you can infer clients from there.

      Overall - no, even memory-safe languages can let you write vulnerable code. Heck, even SQL, which is a database query language, can have SQL injections. Developers write code to reason over infinite possible data. We can’t reason over infinite data, so we make assumptions about it. Vulnerabilities happen when our assumptions can be broken. Theoretically, if you formalize all of your assumptions, you can have a computer check whether those assumptions hold, but then what if you forgot to list an assumption? There is an infinite number of possible assumptions too, so even fully formalized approaches can’t help you 100% (though they can make your code a lot more resilient).
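To make the SQL injection point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table, names, and payload are invented for illustration. The unsafe lookup silently assumes the name never contains a quote character, and the payload breaks that assumption.

```python
import sqlite3


def build_db():
    """Create a throwaway in-memory database with two made-up users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def lookup_unsafe(conn, name):
    # Hidden assumption: `name` never contains a quote character.
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]


def lookup_safe(conn, name):
    # A parameterized query treats `name` purely as data, never as SQL.
    query = "SELECT secret FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]


conn = build_db()
payload = "nobody' OR '1'='1"  # breaks the "no quotes" assumption

print(lookup_unsafe(conn, payload))  # returns every user's secret
print(lookup_safe(conn, payload))    # no user has this literal name
```

Python is memory safe, and the unsafe version is still vulnerable. The fix is not a better language but a query API that no longer depends on the assumption.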

      Better coding practices essentially help developers manage assumptions better. But what happens if the requirements change and you don’t account for old assumptions in the new code? Or what if you’re the new developer and you don’t know what assumptions the code holds? It’s hard. Automation can make it easier, but I doubt code will ever be 100% free of vulnerabilities.