An Information Hiding Proposal for ECMAScript

December 10, 2010

One of the goals for ECMAScript Harmony, the project to define the next versions of the JavaScript standard, is to make JavaScript a better language for writing complex applications.  Better support for object-oriented encapsulation, information hiding, and abstraction should help JavaScript programmers deal with such applications.

Today, I’m going to talk specifically about a proposal for better information hiding in JavaScript.  As I use the term, information hiding means that an object should clearly separate its stable public interface from its private implementation details. Properties that exist solely as implementation details should be hidden from “users” of that object.

JavaScript today does not provide much support for information hiding.  All properties have public string names and there is no language mechanism to tag properties as public or private.  Some programmers use naming conventions as a very weak form of information hiding.  Programmers looking for a stronger form often abandon the use of implementation properties (and prototype inheritance) and use closure capture to represent implementation state.
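To make the two existing approaches concrete, here is a minimal sketch in plain ES5. The constructor names CounterByConvention and CounterByClosure are hypothetical and chosen only for illustration.

```javascript
// Convention only: the underscore is a hint, not a barrier.
function CounterByConvention() {
  this._count = 0; // "private" by convention, but still a public string-named property
}
CounterByConvention.prototype.increment = function () {
  this._count += 1;
  return this._count;
};

// Closure capture: the state is a local variable, not a property at all.
function CounterByClosure() {
  var count = 0; // reachable only from functions created during this call
  this.increment = function () { // a fresh function object for every instance
    count += 1;
    return count;
  };
}

var a = new CounterByConvention();
a.increment();
a._count = 100;             // nothing stops outside code from doing this
var b = new CounterByClosure();
b.increment();
b.count = 100;              // only creates an unrelated public property
console.log(a.increment()); // 101
console.log(b.increment()); // 2
```

The closure version hides its state, but it gives up a shared prototype method and allocates a new function per instance, which is the trade-off described above.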

When evolving a language to address some specific problem, it is natural to look at how other languages approach that problem.  Many people are familiar with information hiding in C++ or Java and may assume that JavaScript should implement information hiding in that same familiar way. In those languages “private” is an attribute of a member (field or method) of a class.  It means that a private member is only accessible to other members of the same class. The fact that member definitions are encapsulated together as a class definition is key to this approach to information hiding.

Java-like information hiding is not a particularly good match for the JavaScript object model, where the structure of an object is much more dynamic and method functions can be dynamically associated or disassociated with an object and shared by many different kinds of objects.

Let’s look at a different approach to information hiding that may be a better fit to JavaScript. In this approach “private” is an attribute of a property name, rather than of an actual property.  Only code that knows a “private name” can use that name to create or access a property of any object. It is knowledge of the name that is controlled rather than accessibility to the property.

Let’s look at how this might look in code.  First assume that a private declaration creates a unique private name and associates it with a local identifier that provides access to that unique name:

private implDetail;

The local identifier can then be used to create object properties whose “key” is that unique private name and also to reference such properties:

function MyObj() {
   private implDetail;
   this.implDetail=42;
   this.answer = function() {return this.implDetail};
}

Remember that the name of the property created above is not actually the string "implDetail" but instead it is a unique value that was created by the private declaration. The identifier implDetail is just a lexically scoped handle that is used to access that private name value. So:

var obj=new MyObj();
alert(obj["implDetail"]);//"undefined"
alert(obj.implDetail);  //"undefined"
alert(obj.answer());  //"42"

For the first alert, the message is "undefined" because obj does not have a property with the string name "implDetail".  Instead it has a property with a private name that does not match that string. In the second alert the statement is not within the lexical scope of the private declaration (within the constructor MyObj) so the property identifier implDetail does not correspond to the private name.  Finally, the third alert calls a method which does have access to the private name and it can return the value of the property.
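The private declaration is proposed syntax, so the example above will not run in an existing engine. As a rough, runnable stand-in, the sketch below assumes an ES2015-or-later engine and uses a Symbol as the unique, non-string property key. This is an approximation rather than the strawman's design: a symbol has to be used with bracket notation, and it can be discovered through Object.getOwnPropertySymbols.

```javascript
function MyObj() {
  // Stand-in for `private implDetail;` (assumption: a Symbol plays the role of the private name)
  var implDetail = Symbol("implDetail");
  this[implDetail] = 42; // the key is the unique symbol, not the string "implDetail"
  this.answer = function () { return this[implDetail]; };
}

var obj = new MyObj();
console.log(obj["implDetail"]); // undefined: no property with that string name
console.log(obj.implDetail);    // undefined: dot access also uses the string name
console.log(obj.answer());      // 42: the closure still holds the unique key
```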

This is a very powerful mechanism for adding information hiding to the JavaScript object model.  By following various usage patterns it is possible to implement the equivalent of both instance private and “class” private properties and also the equivalent of C++ friends visibility.  It can also be used to make extensions to built-in objects such as Object.prototype in a manner that is guaranteed not to collide with independently developed code that might try to make similar extensions.

A complete strawman proposal for this style of information hiding has been prepared for ECMAScript Harmony. It covers many more of the technical details and provides many more complete examples of how the feature could be used.  Take a look and let me know what you think.  We’ll be discussing it in the standards committee but I’d like to get feedback from a broader range of JavaScript programmers. Does this proposal address your information hiding needs?  Does it fit well with the language as you use it?

{ 26 comments }

Peter van der Zee December 12, 2010 at 12:53 am

When it comes to languages like C or Java, information hiding is something that’s (up to a certain point) useful. It allows it to expose a certain API and hide other information from the outside world. Only those with debuggers can access it.

In JavaScript, that’s not really the case. Everybody has a debugger. There’s virtually no way of hiding information in js, not even by closures or all that. So introducing the concept of public/private seems to be more a case of annotation rather than mechanics.

And let’s say you do have the strong mechanism you propose. And say you release some lib with an API using that mechanism. Now somebody wants to use your lib and during development decides he really needs to access a property/method you deemed private. What’s gonna stop him from removing/changing the `private` keyword? He’ll have your source local (even if it’s minified, although that would make it slightly more difficult).

In that light, introducing a new property that “magically” would appear on `this`, but not on the object `this` would represent, seems a little weird to me. I’m afraid it would introduce confusion and inconsistencies (“when I alert it here, it exists, but now it’s gone, wth?”).

However, I do understand that there’s a need to at least be able to annotate and indicate a public/private relationship. So couldn’t the keywords be introduced to accommodate a meta level of information? By that I mean, introduce the public/private keyword prefix which has no side effects? An IDE or browser could then use it to warn the programmer about abusing it, if it detected so. An IDE could use the meta information to properly build up a public API and what not.

var F = function(){
private this.moo;
private this.foo = 5;
public this.bar = "hi";
};
new F().foo; --> "Warning, accessing private property from a public context"

Well, or something like that :)

- peter

allen December 12, 2010 at 1:48 pm

Peter,
It’s interesting that you refer to this proposal as a “strong mechanism” as I know that there are others who would criticize it for being too “weak” of an information hiding mechanism. One of the challenges for language designers is to find a good balance between constituencies that have different requirements or points of view. A designer is unlikely to give everybody everything they want but a successful design should leave all constituencies feeling that the language is better than it was before.

There is a constituency that believes that more powerful language mechanisms like this are necessary to enable JavaScript to be a good language for constructing “large” applications and there is a lot of evidence that such mechanisms do help programmers manage the complexity that is inherent in large programs. However, such mechanisms are never intended to make guarantees that are immune to source code modifications. In practice that doesn’t seem to have made such features irrelevant.

Pre-deployment tools (i.e. IDE-based ones) have a place but they also have many downsides. One is that some feel that only features that are built into the core language will be taken seriously and widely used. A related issue is the standardization of the meaning of such annotations. What happens if one IDE interprets a private annotation in the way you had in mind for your example and another IDE interprets private in a different way? Which style of private should a framework writer use in their code? They are going to want their framework to work with any IDE. Most likely the decision they would make is to not use any such annotations at all.

Allen

Peter van der Zee December 13, 2010 at 12:53 am

Strong may have been an over-statement, I merely meant “one set in stone”. I’m sure there are plenty of holes but I’m also sure that they’ll mostly be fixed before ever reaching the spec :)

I see your point about IDEs but it still feels like we’d be enforcing rules on what code can and can’t be used where. Maybe it’s just me though. I never liked it when code forced me to use or not to use certain snippets from a certain context. I mean, I like public/private to use as a tool of indicating an API. If you want to use private methods then fine, but you’re on your own. (I’ll probably get some flak for this :)

Anyways, I would just prefer to leave the semantics of public/private out in the open. Just a tool, not an enforced restriction.

Dave Herman December 12, 2010 at 8:35 am

@Peter:

“In JavaScript, that’s not really the case. Everybody has a debugger. There’s virtually no way of hiding information in js, not even by closures or all that. So introducing the concept of public/private seems to be more a case of annotation rather than mechanics.”

Huh? How do you suggest programmers get access to closed-over variables in JS?

Dave

Peter van der Zee December 13, 2010 at 12:58 am

From the browser, you can use inspectors and debuggers to read the value, out of the box. To change them from within the browser, I guess that’s not so easy (not sure if at all possible at the moment).

If you serve the code yourself, what’s gonna stop anybody from editing the source of the code that contains the “private” variable you want access to and add a handle to it?

I’m not so much worried that people will think js can safely hide information from the user, but I am saying that it can’t. And in that light, enforcing the concept of public/private just seems a redundant effort.

Colby Russell December 21, 2010 at 9:32 pm

“If you serve the code yourself, what’s gonna stop anybody from editing the source of the code that contains the “private” variable you want access to and add a handle to it?”

The same thing that stops people from doing this in C++ and Java: nothing. But that’s all inconsequential.

Asen Bozhilov December 12, 2010 at 8:55 am

I am not sure that will improve library design. If I need access to these private members I have to define “privileged” methods. What worries me is that memory will be allocated for these “privileged” methods on every instance. In this case I would prefer a naming convention for “hidden” information.

You said that “implDetail” is lexically scoped to the function in which it is defined. My question is regarding:

this.answer = function() {return this.implDetail};

How would the value of `implDetail` be resolved when I call this function? And the more interesting question for me is: if I call this method with a different value for `this`, how would the value of `this.implDetail` be resolved?

var obj=new MyObj();
obj.answer.call({});

Keep forward with this blog, it is really good for every Javascript developer!

allen December 12, 2010 at 2:21 pm

Methods with “private names” as described in the proposal can be shared via prototype inheritance just like any other property. So, there isn’t necessarily any more per-instance overhead than there is with normal properties. Of course, such methods would be “class” private rather than instance private, but if you are concerned about per-instance overhead that is probably what you want.

Regarding your this.answer question. Let’s assume that the definition of MyObj is this:

function MyObj () {
private implDetail;
this.implDetail=42;
this.answer= function() {return this.implDetail}
}

Then the function that is assigned to the property answer lexically captures the private name binding of implDetail that was in scope when the constructor is called. All subsequent calls of that function object are going to use that specific unique private name value.

If you called the function with a different object as its this value, it still uses the original private name value as the key for looking up the implDetail property. If the object doesn't have a property whose name is that same unique private name value, the result of the property access will be undefined. That's how you can control whether a property is instance private or "class private": does each instance use a distinct private name value, or do all instances share the same private name value?

In the definition above, each instance gets a unique private name so:

var obj1=new MyObj();
var obj2=new MyObj();
var answer1=obj1.answer;
print(answer1.call(obj1)); //42 because the answer1 method knows obj1's private name for implDetail
print(answer1.call(obj2)); //undefined because obj2 uses a different private name for implDetail

If the private implDetail; declaration were moved outside of the MyObj function definition, then the second call of answer1 would also return 42.
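The difference between the two placements can be tried out today with a stand-in. The sketch below assumes an ES2015-or-later engine and uses a Symbol where the proposal uses a private name; the constructor names PerInstance and Shared are hypothetical.

```javascript
// Instance private: a fresh key is created on every constructor call.
function PerInstance() {
  var implDetail = Symbol("implDetail"); // stand-in for `private implDetail;` inside the constructor
  this[implDetail] = 42;
  this.answer = function () { return this[implDetail]; };
}

// "Class" private: one key shared by all instances, so the method can live on the prototype.
var Shared = (function () {
  var implDetail = Symbol("implDetail"); // stand-in for a declaration outside the constructor
  function Shared() { this[implDetail] = 42; }
  Shared.prototype.answer = function () { return this[implDetail]; };
  return Shared;
})();

var p1 = new PerInstance(), p2 = new PerInstance();
console.log(p1.answer.call(p1)); // 42
console.log(p1.answer.call(p2)); // undefined: p2 was keyed with a different symbol
console.log(p1.answer.call({})); // undefined: a plain object has no such key

var s1 = new Shared(), s2 = new Shared();
console.log(s1.answer.call(s2)); // 42: both instances share the same key
```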

Asen Bozhilov December 13, 2010 at 5:36 am

Thank you for your valuable answer. By your last example in the previous post it seems we will have *unexpected* behaviour in the inheritance pattern. Let’s assume we have this “class” definition:

var Class1 = (function () {
private hiddenField;
function Construct() {}
Construct.prototype = {
hiddenField : 10,
answer : function () {
return this.hiddenField;
}
};
return Construct;
})();

var Class2 = function (){};
Class2.prototype = new Class1;
Class2.prototype.answer = function () {
return Class1.prototype.answer.call(this); // undefined ?
};

That was the reason for my previous question about the `this` value. If in this case it returns `undefined`, this is not very good for the regular JS programmer unless he is trying to create non-generic methods. Am I correct?

allen December 13, 2010 at 7:51 am

You would get 10!

Let’s say we do the following:

var aClass2 = new Class2;
print(aClass2.answer());

The call from the print function is going to invoke the answer method defined in Class2.prototype. The body of that function is going to execute as if it was:

return Class1.prototype.answer.call(aClass2); //this was bound to aClass2 value

When Class1.prototype.answer is executed by the above call, its body is going to execute as if it was:

return aClass2.hiddenField; //this is still bound to aClass2 value, hiddenField is private from Class1

That hiddenField access starts with aClass2 and searches the prototype chain until it finds the definition of hiddenField in Class1.prototype and returns the value 10.

So it all works just like someone would expect inheritance to work. However, if you want to override hiddenField in a “subclass” you have to have access to its private name. Anybody can override answer and define a new one with a different implementation, perhaps using a different private backing property.
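The lookup described above can be checked with a runnable approximation. This sketch assumes an ES2015-or-later engine and substitutes a Symbol for the proposed private name; it also assigns onto the existing prototype object instead of replacing it with an object literal, which does not change the lookup.

```javascript
var Class1 = (function () {
  var hiddenField = Symbol("hiddenField"); // stand-in for `private hiddenField;`
  function Construct() {}
  Construct.prototype[hiddenField] = 10;
  Construct.prototype.answer = function () { return this[hiddenField]; };
  return Construct;
})();

function Class2() {}
Class2.prototype = new Class1();
Class2.prototype.answer = function () {
  // `this` is the Class2 instance; the key lookup walks its prototype chain
  return Class1.prototype.answer.call(this);
};

var aClass2 = new Class2();
console.log(aClass2.answer()); // 10, found on Class1.prototype
```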

alexis coudeyras December 12, 2010 at 9:13 am

There is a big difference between your proposition (the same could be said about Crockford’s famous pattern) and Java: in Java private is just a convenient way (enforced by the IDE) to tell the client developer that they *should* not use this variable/method. *Should* and not *could*, because they can access it using reflection, and it could be very useful to manipulate an object’s internal state for unit testing, for example (avoiding complex setup methods).
So I prefer prefixing private/protected properties with _ because I think it’s closer to the “spirit” of the OO private keyword (the client developer has the information that they should not use it, but they still can).

allen December 12, 2010 at 2:27 pm

In Java, “private” is enforced by the Java compiler and jvm. It isn’t just an IDE feature. It is only via reflection that Java “private” can be circumvented at runtime. The reflection loophole also exists for my proposal so I don’t really see what distinction you are making between the two.

Tony Garnock-Jones December 12, 2010 at 10:30 am

Hi Allen,

In the strawman, you write “it is likely that properties defined using private names should not show up in for-in enumerations”. Could you expand upon that a little? It seems to me that it would be more useful for private-named properties to behave as similarly to regularly-named properties as possible, so I’d like to understand why you think it’d be better to make them non-enumerable.

Regards,
Tony

allen December 12, 2010 at 2:35 pm

It’s all a matter of what you believe is the role of for-in enumeration and the enumerable attribute of properties. My assumption is that for-in is intended to enumerate over the “public data” of an object (which is why built-in methods and things like array length are not enumerable). It seems like a reasonable assumption that if you are using a private name for a property then you don’t intend it to be part of the object’s “public data”. If this isn’t the case, you can always use Object.defineProperty to force the private named property to be enumerable.

Overall I agree with you about minimizing differences between regular and private named properties. Some people have argued that private named properties should never be enumerable. What is in the proposal seems like a reasonable middle ground.
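For readers less familiar with the enumerable attribute, here is a small ES5 sketch of how it already behaves for ordinary string-named properties. The object and property names are hypothetical, and how the attribute would default for private-named properties is exactly what the strawman is still deciding.

```javascript
var point = { x: 1, y: 2 };

// A string-named property marked non-enumerable is skipped by for-in.
Object.defineProperty(point, "cache", {
  value: "internal",
  enumerable: false,
  writable: true,
  configurable: true
});

var seen = [];
for (var key in point) { seen.push(key); }
console.log(seen);        // [ 'x', 'y' ]
console.log(point.cache); // "internal": still readable by name

// Array length is a built-in example of a non-enumerable property.
var arrayKeys = [];
for (var k in ["a", "b"]) { arrayKeys.push(k); }
console.log(arrayKeys);   // [ '0', '1' ]
```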

FremyCompany December 12, 2010 at 12:05 pm

Well, I would say it’s not completely true you can’t have private variables in JScript today.

function myClass() {
var privateProperty = undefined;
Object.getPrivateProperty = function() { if (/* arguments.caller returns a function defined in the object constructor */) { return privateProperty; } else { throw new Error('Private member'); } };
}

myClass.prototype={ patati patata };
Object.freeze(myClass.prototype);
Object.freeze(myClass);

The only problem is that the behavior of arguments.caller was left undefined in the ECMAScript 5 specification. It makes it impossible to perform such a test reliably. Some browsers even return (arguments.caller=null) when the called function is a property accessor (if it were working and returning the true calling function, we could have replaced getPrivateProperty by a property accessor, which would be a real private property).

allen December 12, 2010 at 2:41 pm

ES5 did more than leave arguments.caller undefined. For strict mode it totally banned it by making it a poison-pill property (it always throws when used with strict mode functions).

Regardless, the intent of this proposal is to provide a straightforward way for JavaScript programmers to employ information hiding in their code. I don’t think anything that depends upon using arguments.caller meets that criterion.

FremyCompany December 14, 2010 at 2:17 am

Yeah, I’ve seen that, too. I didn’t clearly understand the reason of this choice (of killing arguments.caller), but, well, now it’s done, it’s done. Scriptum scriptum.

I don’t think it is an intelligent design because if you want to implement “private” properties, you’ll need to know who the calling function is (and, even more, what the ‘this’ value currently used in that function is). This means you need to have the arguments.caller value. Or did I miss something ?

If the only goal is to hide the property from any reference other than ‘this’ (i.e. ‘obj.privateProp’ would fail but ‘this.privateProp’ would succeed in every case), it’s not worth implementing, because we could do (function() { return this.privateProp; }).call(obj); to retrieve a private property value from any “obj” object.

For me, the only functions that should have access to this.privateProp are the functions (or property accessors) defined in this.constructor.prototype. If there’s no way to make sure the property is not accessible (using a more or less complex trick) even when this.constructor.prototype and its chain are frozen, adding fake private properties is not worth the change, since a ‘hacker’ would still be able to read the value using user code (maybe more complex than usual, but still).

Anyway, we need a formal proposal to see if the proposal works or not; it’s difficult to speak ad nauseam because we’re, a priori, assuming objectives and behaviors which may not reflect what the author intended.

allen December 14, 2010 at 8:24 am

The (relatively) formal proposal is at http://wiki.ecmascript.org/doku.php?id=strawman:private_names.

You seem to be assuming a particular property access control model based upon the identity of the requester of a property value. The private names proposal isn’t trying to implement that specific model nor really any specific access control model. Instead it is trying to provide a primitive mechanism for the language that can be used for various property access control use cases. The example in the original post is just one simple use case example. The full proposal has more.

FremyCompany December 16, 2010 at 12:33 pm

Thanks for the link. It’s an interesting way to handle the problem. This starts from a completely different view point, which I expected when I read your last answer. It’s great, but I think you should not speak about access modifier, because it’s not what this proposal does, even if it has a similar purpose (but can have more purposes than just member hiding, as shown on the wiki webpage).

Basically, it’s only a new way to generate particular property names (at this time, we could do nearly the same thing in ES5 by generating very complex random-based property names and marking the property as non-enumerable; we would just need to use obj[propertyName] instead of obj.propertyName). It’s funny because it’s exactly what I did in some recent code I wrote to see what could be done about private properties for JScript :-)
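The ES5 approximation mentioned in this comment might look roughly like the sketch below. The helper makeHiddenName and the Account constructor are hypothetical names used only for illustration. Note the limits: the name is merely hard to guess, and Object.getOwnPropertyNames still reveals it.

```javascript
// Hypothetical helper: builds a hard-to-guess string property name.
function makeHiddenName(label) {
  return "__" + label + "_" + Math.random().toString(36).slice(2) + Date.now().toString(36);
}

var Account = (function () {
  var balanceName = makeHiddenName("balance"); // known only inside this closure
  function Account(initial) {
    Object.defineProperty(this, balanceName, {
      value: initial,
      writable: true,
      enumerable: false, // keeps it out of for-in and Object.keys
      configurable: false
    });
  }
  Account.prototype.deposit = function (amount) {
    this[balanceName] += amount; // bracket access, since the name is a computed string
    return this[balanceName];
  };
  return Account;
})();

var acct = new Account(10);
console.log(acct.deposit(5));                         // 15
console.log(Object.keys(acct));                       // []
console.log(Object.getOwnPropertyNames(acct).length); // 1: the name is hidden, not secret
```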

Reading the proposition, I’ve got some remarks, though. In the samples, I think “print(thing1.hasKey(thing1)); // true” should be “print(thing1.hasKey(key)); // true” instead. Right ?

My last question, and maybe the best one:
You’ve two objects from different types, which both have a ‘private’ property called ‘key’. If we want to access both properties from function A() {}, we need to define two new aliases (key1 for the first class, and key2 for the second class). It is somewhat confusing to have a property changing its name because another object has the same ‘code’ property name, but another ‘randomly generated’ name. I mean, with the property name becoming a sort of variable, we can only have one (private) property sharing the same name at the same time in one context, while we could have as many as we want now. I don’t know if there’s a way to solve that, but maybe it’s worth thinking about it… Maybe linking the private ‘key’ to a prototype could solve the problem (we could have private key = [[MyProto1, key1], [MyProto2, key2]], but it would need some logic to find out which prototype is applicable in the current case).

allen December 20, 2010 at 10:35 am

“In the samples, I think “print(thing1.hasKey(thing1)); // true” should be “print(thing1.hasKey(key)); // true” instead. Right ?”

No, the example is correct as written. However, the hasKey method is perhaps misnamed in a way that creates confusion. hasKey actually checks whether the value of the “private” key property of this object is the same as the value of the same “private” key property of the argument object. That requires that both objects are defined using the same private name for the property. That is why this example is being used to illustrate the difference between “instance private” and “class private” properties.

Regarding your last question, I think having to use two different local private identifiers is probably a good thing. Situations like this seem most likely to occur within code that is trying to integrate two or more independently written pieces of code in some manner. If the code being integrated coincidentally used the same private identifier to refer to logically unrelated private named “friend” properties, then it is up to the integration code to make explicit which specific property it is referring to at each point. Using different local private identifiers is a way to make this explicit.

Charlie Robbins December 12, 2010 at 12:32 pm

Allen,

I honestly think the last thing we need is access modifiers. Yes, sometimes it is impossible to get to a closed variable, but that’s really just in certain implementations of functionality. In each of those cases it is (most likely) possible to reimplement the same functionality without any such issues.

To me, really access modifiers are just hubris of the programmer. How do you know in some esoteric or novel usage scenario out in the wild that some programmer won’t need access to ‘implDetail’? I much prefer that information be hidden from me “as a suggestion that I probably shouldn’t mess with it unless I know what I’m doing”. In these cases I use the “this._implDetail” notation to indicate such information.

Just my two cents. Interesting take on where we could go in Harmony.

allen December 12, 2010 at 2:45 pm

I actually agree that the use of access modifiers is somewhat overrated. However, many people think otherwise. One of the goals of this proposal was to try to find a middle ground that provides a reasonable mechanism for information hiding (for those who demand it) without doing fundamental damage to the core JavaScript object model.

Anentropic December 16, 2010 at 3:42 am

Please, please, please… don’t do this.

This satirical article is one of the best explanations I’ve seen as to why not:
http://steve-yegge.blogspot.com/2010/07/wikileaks-to-leak-5000-open-source-java.html

Can’t we just follow the Python model for ‘private’ properties, i.e. convention over legislation.

Benjamin Smedberg December 22, 2010 at 5:58 am

I really don’t think we need a new language syntax for these private properties. Why can’t we instead have a new “key type” which can be hidden? e.g.

var kImplDetail = new Property();

var o = {};
o[kImplDetail] = value;

Perhaps a little bit of sugar to make object literals easier to write would be helpful:

var o = {
(kImplLiteral): value
};

If you want to share access to the kImplLiteral property, all you have to do is hand out the kImplLiteral property object, and you can fall back gracefully in environments which haven’t implemented private properties, because there isn’t a new syntax.
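For comparison, ES2015 later added Symbol values and computed property keys, which come close to the shape suggested in this comment. The sketch below assumes an ES2015-or-later engine; Symbol stands in for the hypothetical new Property(), and the literal uses square brackets where the comment suggests parentheses. It is offered as an illustration, not as what the comment's author had in mind.

```javascript
var kImplDetail = Symbol("implDetail"); // plays the role of `new Property()`

var o = {};
o[kImplDetail] = 1;

// Computed key in an object literal: brackets rather than the suggested parentheses.
var literal = { [kImplDetail]: 2, visible: 3 };

console.log(o[kImplDetail]);          // 1
console.log(literal[kImplDetail]);    // 2
console.log(Object.keys(literal));    // [ 'visible' ]
console.log(JSON.stringify(literal)); // {"visible":3}
```

Sharing access works the way the comment describes: whoever is handed kImplDetail can read and write the property, and nobody needs new declaration syntax.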

allen December 22, 2010 at 8:50 am

This is certainly a reasonable alternative to consider. It precludes the possibility of using dot qualification to access private properties:

o.implDetail = value; //can't say this

Perhaps more significantly, it requires new syntax to include them in object literals: the parentheses in your example.

The main factor in considering this and other alternative syntax proposals for this functionality is which will be easiest for JavaScript programmers to learn.

Jeff Walden January 10, 2011 at 11:26 am

I can’t resist: can we use [[implDetail]] for new syntax in object literals? Okay, probably a bad idea. But I had to at least mention it. ;-)
