For the last two weeks, I’ve been trying to write Piggy patterns to construct a symbol table from a Java AST. Patch after patch, I’d change the pattern matching code to “fix” something that wasn’t working. Unfortunately, I finally wrote the pattern that broke the camel’s back, “< classBodyDeclaration < modifier >* < memberDeclaration < methodDeclaration > > >”, which looks for methods within a class.
So, I decided to rewrite the engine the way it should have been done: using an NFA. It has taken a week or so thus far, but it turns out that the pattern matcher is much cleaner, and likely faster. In addition, the output engine–which executes the code blocks in the pattern–is also much cleaner. I’ll try to see if I can combine the patterns together in one automaton.
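To make the NFA idea concrete, here is a minimal sketch in Python. It is not Piggy's code. The `Node` and `Pat` classes, the `closure` and `matches` helpers, and the reading of the pattern syntax (a starred child pattern matches zero or more siblings; a pattern with no child patterns accepts any children) are all assumptions made for illustration. The sibling list of each node is matched by tracking a set of NFA states rather than by backtracking.

```python
from dataclasses import dataclass, field

@dataclass
class Node:                      # hypothetical AST node
    name: str
    children: list = field(default_factory=list)

@dataclass
class Pat:                       # hypothetical tree pattern
    name: str
    kids: list = field(default_factory=list)  # child patterns, in order
    star: bool = False                         # True: zero or more of this pattern

def closure(states, kids):
    """Epsilon-closure: a starred child pattern may be skipped entirely."""
    result, work = set(), list(states)
    while work:
        i = work.pop()
        if i in result:
            continue
        result.add(i)
        if i < len(kids) and kids[i].star:
            work.append(i + 1)
    return result

def matches(pat, node):
    """True if node matches pat; state i means 'kids[:i] already consumed'."""
    if pat.name != node.name:
        return False
    if not pat.kids:             # assumption: no child patterns means "any children"
        return True
    states = closure({0}, pat.kids)
    for child in node.children:
        nxt = set()
        for i in states:
            if i < len(pat.kids) and matches(pat.kids[i], child):
                # A starred pattern loops on itself; a plain one advances.
                nxt.add(i if pat.kids[i].star else i + 1)
        states = closure(nxt, pat.kids)
        if not states:
            return False
    return len(pat.kids) in states   # accept only if every child pattern is done

def find_all(pat, root):
    """Collect every node in the tree, in preorder, that matches pat."""
    found = [root] if matches(pat, root) else []
    for child in root.children:
        found.extend(find_all(pat, child))
    return found

# < classBodyDeclaration < modifier >* < memberDeclaration < methodDeclaration > > >
method_pattern = Pat("classBodyDeclaration", [
    Pat("modifier", star=True),
    Pat("memberDeclaration", [Pat("methodDeclaration")]),
])

with_mods = Node("classBodyDeclaration", [
    Node("modifier"), Node("modifier"),
    Node("memberDeclaration", [Node("methodDeclaration")]),
])
field_only = Node("classBodyDeclaration", [
    Node("memberDeclaration", [Node("fieldDeclaration")]),
])
no_mods = Node("classBodyDeclaration", [
    Node("memberDeclaration", [Node("methodDeclaration")]),
])
tree = Node("classBody", [with_mods, field_only, no_mods])

found = find_all(method_pattern, tree)
print(len(found))
```

Because the matcher carries a set of states across the siblings, the starred `modifier` pattern needs no special-case code: skipping it and looping on it are both just transitions.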
I really should have known better than to approach the tree regular expression matching problem following what other people did, using a top-down recursive recognizer. Live and learn. Always follow a clean, clear theory instead of just hacking.
Due to my work on Piggy, I’m starting to do a thorough review of the literature on program transformation systems, how Piggy relates to prior research, and what improvements I can make to Piggy. Note, a good list to start from is in Wikipedia: ASF+SDF, CIL (for C), Coccinelle (for C), DMS, Fermat, Spoon (for Java), Stratego/XT, and TXL. This is the first entry in the series, on Coccinelle, a system that modifies C source code.
No ifs, ands, or buts: news about Net Standard may be old and stale, but one thing still applies. As Steve Martin would say, “What the hell is that?” Believe it or not, I started writing this entry years ago, a few months after Net Standard first came out. I had hoped that if I just wrote about it, I’d somehow stumble across what the hell it is. But I didn’t get anywhere, because most people don’t know what the hell they’re talking about. Fast forward to now.
XPath (1, 2, 3) is a language for finding nodes in an XML tree, and has a long history in AST search. Maletic et al. (4) is probably the first paper on XPath used on ASTs, using Antlr. It was further researched and is now part of the OSS world (5). In 2014, Parr added to Antlr releases an XPath API to search Antlr-generated ASTs (6). Src-d uses XPath and an engine for “universal” ASTs (7).
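As a small illustration of XPath-style search over a tree, here is a sketch using Python's standard `xml.etree.ElementTree`, which supports a limited XPath subset. The XML below is a hypothetical toy "AST" whose element names are modeled on Java grammar rule names; it is not the output of Antlr or of any tool mentioned above, and `find_method_names` is a made-up helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy AST serialized as XML (an assumption for illustration only).
SOURCE = """
<compilationUnit>
  <classDeclaration name="Greeter">
    <classBodyDeclaration>
      <modifier>public</modifier>
      <memberDeclaration>
        <methodDeclaration name="hello"/>
      </memberDeclaration>
    </classBodyDeclaration>
    <classBodyDeclaration>
      <memberDeclaration>
        <fieldDeclaration name="count"/>
      </memberDeclaration>
    </classBodyDeclaration>
    <classBodyDeclaration>
      <modifier>private</modifier>
      <modifier>static</modifier>
      <memberDeclaration>
        <methodDeclaration name="helper"/>
      </memberDeclaration>
    </classBodyDeclaration>
  </classDeclaration>
</compilationUnit>
"""

def find_method_names(xml_text):
    """Return the name attribute of every method declared in a class body."""
    root = ET.fromstring(xml_text)
    # ".//" searches all descendants; each "/" then steps to a direct child.
    path = ".//classBodyDeclaration/memberDeclaration/methodDeclaration"
    return [node.get("name") for node in root.findall(path)]

method_names = find_method_names(SOURCE)
print(method_names)
```

The path expression names the same chain of rules a tree pattern would, but it only describes a route down the tree; it says nothing about sibling order or repetition.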
One further refinement to Piggy is required before I make a release of the tool: a wrapper to get the tool under MSBuild. Like the Antlr4BuildTasks wrapper I forked from Antlr4cs, I want Piggy to work seamlessly during the build of a C# project that uses a native library. My plan is for C# projects to contain the Piggy templates required to generate the C# declarations of the interface needed by the project. The user would need to supply a template for Piggy and a C++ file for the Clang compiler. During a build, the Piggy tool would run and produce C# output in the build directory, which would then be compiled and linked with the project. So, instead of writing the DllImport decls to work with a native library, users just indicate what they want and let Piggy do the rest. The build tool would be released to NuGet, and would contain the Clang serializer, the Piggy tool, the assembly wrapper for the Clang serializer and Piggy, and all the build rules.
It never ceases to amaze me how people can write a huge API and never bother to document how to use it. But, it’s been that way for as long as I can remember, going back 35 years. In my latest adventures, I’ve been trying to compile, link, and run C# code dynamically using Roslyn for Piggy, my transformational system. If you’ve ever used Roslyn in C#, you’ve probably discovered that it can be a pain in the arse to use: Microsoft provides reference docs for the API and some tutorials, but I can’t find a simple example for compiling, linking, and running C#. I don’t need to know all the details yet, just a starting point framework. Unfortunately, the solution is quite sensitive to whether you use Net Core or Net Framework.
In order to better support Piggy, which uses Antlr4, I’ve added a NuGet package called Antlr4BuildTasks. This package is a pared-down derivative of Sam Harwell’s excellent Antlr4cs code generator package, and includes just the rules and code needed to do builds in MSBuild, Dotnet, or the Visual Studio 2017 IDE, but not the Antlr4 tool itself. This package decouples the build rules from the Antlr4 tool and runtime, so you can build Antlr programs using the latest Java-based Antlr tool and runtime release. To use this package, reference it as you would any NuGet package; make sure to also reference the Antlr4.Runtime.Standard package, install Java and the Java-based Antlr tool, and set JAVA_HOME and Antlr4ToolPath. The tool works with Net Core, Net Framework, or Net Standard code, and on Windows or Linux.
After a bit of hacking over the last month or two, I can finally see that I am making progress on Piggy, a new kind of p/invoke generator. Some might say “Why in the world are you wasting time writing a p/invoke generator? Aren’t there tools already that do this?” Well, yeah, there are other generators, but they all…how should I say…suck! I need a p/invoke generator for Campy, a compiler and runtime for C# for GPUs, which I am still working on, but had to place on the back burner to work on this. Campy uses LLVM and CUDA. Because these libraries are large and constantly changing, I have to have an automated way of handling new releases.
If you’ve been programming in C# for a while, at some point you found yourself needing to call C libraries. It isn’t often, but when you have to do it, it’s like pulling teeth. One option is to set up a C++/CLI interface; the other is a p/invoke interface to a DLL containing the C code. It’s relatively easy to set up a p/invoke interface in your C# code for the C code, which you export with a DLL, if you only need to call a few C functions. But, if the API is large, you stare at the code for a while, deciding whether it is really worth writing out all the declarations you need to make the calls. Many people throw caution to the wind and write packages for large, popular C APIs so you don’t have to; you can find these on the NuGet website. One example is ManagedCuda, an API for CUDA programming from C#. Unfortunately, people get tired of trying to keep these packages up to date, and so the packages become obsolete. Another approach is through automatic means, whereby a tool reads the C++ headers (or DLL) and outputs the decls you call. A p/invoke generator reads C header files and outputs C# code with the p/invoke declarations that you can include in your code. These tools sometimes work, but often they don’t.
This blog entry is a “heads up” note about my thoughts for a new type of p/invoke generator.