When calls from Jade out to external functions written by ourselves or 3rd parties crash, they take out the node.
Matters are worse still if the logic is running server-side as that will crash the whole environment (jadrap).
The Developer Security Library is an example of a call that runs as a serverExecution and this has previously taken out our main live environment.
Is there any way the Jade product can be improved to be more resilient when calling out to libraries?
Perhaps arm an exception handler to help minimise the fallout?
Any solution would ideally be made in-process, or at least we would need to preserve the current ability to call back into the current Jade process.
This idea is complementary to the DSL hooks idea previously posted.
After careful consideration we have decided to not proceed with providing a generalized "no crash" handler for any system exception since it has the potential to result in undetected data corruption.
The product already catches structured exceptions triggered in external function calls or third-party libraries. Our preferred action is to take a process dump (unless this is disabled by config) and to terminate. Node termination mitigates the risk of propagating possible in-memory corruption that could compromise data integrity.
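To make that policy concrete, here is a rough sketch of the catch, dump, terminate pattern. It is written in Python purely as an analogy: it is not JADE code, Python exceptions are not Windows structured exceptions, and the names `call_external`, `dump_enabled`, `dump_path` and `terminate` are hypothetical, invented for this illustration.

```python
import os
import traceback


def call_external(func, *args, dump_enabled=True,
                  dump_path="process_dump.txt", terminate=os._exit):
    """Call func; on any fault, optionally write a dump, then terminate.

    All parameter names are hypothetical. A real product would write a
    native process dump rather than a Python traceback.
    """
    try:
        return func(*args)
    except Exception:
        # The caller cannot tell what state the faulting code left behind,
        # so record diagnostics (unless disabled) and stop the process
        # instead of carrying on.
        if dump_enabled:
            with open(dump_path, "w") as fh:
                traceback.print_exc(file=fh)
        terminate(1)
```

The `terminate` parameter exists only so the sketch can be exercised without actually killing the interpreter; by default it calls `os._exit`, which ends the process immediately without running cleanup handlers.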
With reference to the example scenario covered by https://jedi.ideas.jadeworld.com/ideas/JAD-I-402, our preference is to provide an alternative source control interface with client-side callbacks that could be used in place of the DSL.
We are also prepared to consider targeted fault resilience on a case-by-case basis. PAR 67257, alluded to in the comments, is an example.
When a PAR is patched, the reporting client receives an email. That PAR was fixed within days of you reporting it, and nobody has ever asked for a hotfix. Also, this is not related to 3rd party code; it is related to general resilience. Unfortunately, again, when this kind of exception occurs, it is impossible to know that no serious damage has been done by the misbehaving code. If we caught things like this and let the process run, you might find that your methods return strange results or odd data gets written to the database. We are open to ideas. How about we run every JADE application in a separate process, like the tabs in Chrome?
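As a rough illustration of the process-per-application idea, the sketch below runs a piece of risky code in a child process so that a hard crash takes out only the child. It is Python rather than JADE, and `run_isolated` is a hypothetical helper invented for this example; it says nothing about how JADE itself would implement isolation.

```python
import subprocess
import sys


def run_isolated(code, timeout=10.0):
    """Run a Python snippet in a separate interpreter process.

    Returns (crashed, returncode, stdout). A crash in the child, even one
    that kills its process outright, leaves the parent running.
    """
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    crashed = proc.returncode != 0
    return crashed, proc.returncode, proc.stdout.strip()


if __name__ == "__main__":
    # A well-behaved call returns its output normally.
    print(run_isolated("print(6 * 7)"))
    # os.abort() kills the child abruptly; the parent just sees a
    # non-zero return code and carries on.
    print(run_isolated("import os; os.abort()"))
```

The trade-off is the one raised in the original post: once the risky code lives in another process, calling back into the current JADE process is no longer a plain function call and would need some form of inter-process communication.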
An example of the internal areas/issues that the wider JADE product should be protected against is the auto-complete functionality. We've been waiting months for a fix to PAR #67257 to be delivered to prevent the development environment crashing when we're looking at a particular kind of code that the auto-complete functionality can't handle. It can be quite disruptive when this happens while we figure out why our development environment has frozen up, but the alternative is worse (don't use auto-complete at all). There are always going to be edge cases it can't handle properly, especially when new language features are introduced. Ideally, this is an area where the JADE product would be safeguarded in a way that avoids crashing our thin-clients entirely, perhaps with a simple warning message in the status line indicating auto-complete isn't working.
If the DLL raises an exception, then I appreciate that the Plant's decision to crash the node immediately would help expose the problem as well as minimise the impact of further damage.
It's pessimistic vs optimistic, however, as not all errors will overwrite JADE's memory, so some customers may prefer the risks associated with limping along over no access at all, at least in a triage situation. This is a wider question that could be asked of some of JADE's internal asserts as well, as to whether there's an alternative 'serious error' approach.
Writing our own exception handler is something we’ve previously discussed, which is fine, but the JEDI is primarily about the possibilities of making the product more resilient.
I've been wondering about how to sandbox untrusted code for a while now. I'm glad this has come up, as this will be a hard one to sell. In your example, you could of course have had an exception handler of your own. In standard external function calls, we do have an exception handler which terminates the node if it activates. This is deliberate, as the untrusted code could have scribbled over any and all memory in the Windows process. I'm not sure off the top of my head about calls to the DSL, and external method calls are probably protected, too.
What do you mean by "call back into the current JADE process"?