Hadley Wickham <hadley at rice.edu>
Wouldn't it be better to inspect the evaluation environment for the
traces of the evaluation and then dispatch on the objects discovered?
Then the code
aaa <- local({ ... compute object ..})
will correctly dispatch on aaa and won't be ignored.
HW> It does do that - but you need to parse the call because there are a
HW> number of calls that produce global side effects (e.g. creating
HW> classes and methods). You could try doing it after the fact (e.g. by
HW> using S4 introspection to find all the objects to document), but then
HW> it's much more difficult to match the documentation block with the
HW> object.
Not that difficult. S4 always leaves traces in the evaluation
environment, in objects whose names match these patterns:
methods:::.TableMetaPattern()
[1] "^[.]__T__"
methods:::.ClassMetaPattern()
[1] "^[.]__C__"
Inspecting those, you know exactly what was installed.
S3 methods are stored locally in the ".__S3MethodsTable__." object. So no
trouble at all. I have done this before, and can provide the necessary
code.
It looks like internal hackery, but it's really not. This
implementation will hardly ever change, and if it does, it will be easy to
adapt.
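A minimal sketch of the inspection described above. The class "Demo" and the generic "describe" are hypothetical, the global environment stands in for the evaluation environment, and the patterns are the ones printed above rather than fetched through the internal accessors:

```r
library(methods)

# Stand-in for the environment in which the package code is evaluated.
env <- globalenv()

# Hypothetical definitions, only here to leave some S4 traces behind.
setClass("Demo", representation(x = "numeric"), where = env)
setGeneric("describe", function(object) standardGeneric("describe"),
           where = env)
setMethod("describe", "Demo", function(object) "a Demo object", where = env)

# Class definitions and method tables are stored under hidden names.
class_meta <- ls(env, all.names = TRUE, pattern = "^[.]__C__")
table_meta <- ls(env, all.names = TRUE, pattern = "^[.]__T__")

# Strip the prefix to recover the names of the installed classes.
installed_classes <- sub("^[.]__C__", "", class_meta)
```

Here installed_classes should contain "Demo", and table_meta should contain an entry for the "describe" generic.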
It will be possible to create documentation for a bunch of objects at
the same time. For example
local({ a <- generate_object_a()
b <- generate_object_b()})
will document both a and b.
HW> Could you flesh out this example a bit more? I don't understand why
HW> you'd want to document objects that aren't evaluated by the user.
Ah sorry, that was stupid. I meant
eval({ a <- generate_object_a()
b <- generate_object_b()})
Roxygen could adopt a convention: if two declarations follow
immediately after each other, they should be documented in the same
roxy-doc and the same Rd file:
foo <- function(a) ..
boo <- function(a) ..
will put both foo and boo in the same file. Currently one needs two
documentation blocks and an rdname tag, if I am not mistaken.
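For comparison, the current two-block approach with @rdname looks roughly like this. The function bodies and the documentation text are made up for illustration, since the bodies are elided above:

```r
#' Add a small constant to a value
#'
#' @param a A numeric value.
#' @rdname foo
foo <- function(a) a + 1

# The second block only exists to point boo at foo's Rd file.
#' @rdname foo
boo <- function(a) a + 2
```

Under the proposed convention, the second block would not be needed.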
HW> I don't see the need for every tag to be class aware, just a few.
This is a complication. You end up implementing your own OO system:
#' The roccer object is a key component in roxygen3 - it defines the behaviour
#' of a tag with a \code{parser} and a \code{output} write.
Why would you need a roccer if you already have "classes" and "methods"
to define the behavior of the tag?
HW> I'm not sure I get your point - the roccer _is_ the object that
HW> represents the tag.
You meant roccer as an abstract encapsulation of the behavior of the
tag. That is
structure(list(name = name, parser = parser, output = output), class = "roccer")
An abstract notion of a "tag" has a representation with a name and two
methods which describe the behavior of a tag. This is precisely the task
of a class/method system.
The add_roccer and roccer functions are basically a replacement for setClass
and setMethod.
In S4, instead of roccer + add_roccer + basic_roccer + etc. you might do:
setClass("RoxyTag", list(name = "character"))
setGeneric("roxyParse", function(tag) NULL)
setGeneric("roxyRd", function(tag) NULL)
For every tag:
setClass("RoxyFamily", contains = "RoxyTag")
setMethod("roxyParse", signature = "RoxyFamily",
def = ... )
instead of
parse_family <- function() ...
roc_family <- roccer("family",
roc_parser(tag = text_tag(), all = parse_family))
It looks like you want a simple interface, but it ends up being a cross
between S3 and an internal roxygen system for keeping objects (i.e. tags). A sort
of _roxyClasses_ approach, and it looks like you haven't yet got to the
inheritance and extension mechanism.
(Actually, I think I am starting to understand why you proposed to split the
package and keep the tags separately: to simplify the extension of
tags (is that it?). If the tags are S4 classes, then this is not a
problem. Any package can extend the system!)
To wrap up, quite a lot of the current code is essentially OO bookkeeping,
and can be eliminated completely by delegating the work to S4.
You can just have a virtual S4 class "roxy_tag". Then subclass
"roxy_tag_oxygen" and have all other tags derive from that. Most of them
will probably have only two slots, "name" and "text", but some like
"slots" tag will have more.
Then you can have a "roxy_split_oxygen(object, doc)" generic dispatched on
object, which would split the string 'doc' into tags. Each tag is an
object. Then another generic "roxy_parse(tag)" to actually parse the
tag. Another generic "roxy_rd(tag)" to generate the Rd entry, and yet
another generic "roxy_template(tag)" to generate the template. And so on.
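A rough sketch of that hierarchy, simplified so that the concrete tags derive directly from the virtual class. The concrete classes "roxy_tag_title" and "roxy_tag_family" and all method bodies are assumptions made for illustration, not existing roxygen code:

```r
library(methods)

# Virtual base class: it cannot be instantiated, only extended.
setClass("roxy_tag",
         representation("VIRTUAL", name = "character", text = "character"))

# Hypothetical concrete tags; most need nothing beyond name and text.
setClass("roxy_tag_title", contains = "roxy_tag")
setClass("roxy_tag_family", contains = "roxy_tag")

setGeneric("roxy_parse", function(tag, ...) standardGeneric("roxy_parse"))
setGeneric("roxy_rd", function(tag, ...) standardGeneric("roxy_rd"))

# Default behaviour, inherited by every tag that does not override it.
setMethod("roxy_parse", "roxy_tag", function(tag, ...) trimws(tag@text))
setMethod("roxy_rd", "roxy_tag", function(tag, ...) {
  sprintf("\\%s{%s}", tag@name, roxy_parse(tag))
})

# A tag that needs different output only overrides that one method.
setMethod("roxy_rd", "roxy_tag_family", function(tag, ...) {
  sprintf("%% family: %s", roxy_parse(tag))
})

title_rd <- roxy_rd(new("roxy_tag_title", name = "title",
                        text = "  Add numbers "))
family_rd <- roxy_rd(new("roxy_tag_family", name = "family",
                         text = "adders"))
```

The title tag falls back to the default methods, while the family tag reuses the inherited roxy_parse and replaces only roxy_rd.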
HW> I think this is more in line with how roxygen2 works. You can't have
HW> methods that just work with a single documentation block + object
HW> (rocblock for short) at a time, because some tags work more globally
HW> (e.g. @family, @include, @inheritParams). That's more of a comment on
HW> your suggested function names rather than using S4 - but it does have
HW> a big impact on the API.
I didn't think about that. I barely understand how roxygen works
yet :). You have a rockblock_parser class already. I guess it's just a
question of S3 or S4 then.
Actually, the parsing generic (roxyParse or whatever) can by default
take two arguments, the object and the whole bunch of rocblocks. Each
tag will decide for itself whether to use the rocblocks argument or not.
I guess you are already doing something similar, but I am a bit confused
about why the distinction between parse_rocblocks and roccer$parser is
necessary.
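A sketch of the two-argument generic. The representation of rocblocks as a named list with a "family" element is purely an assumption for this example, as are the classes "RoxyTitle" and "RoxyFamily":

```r
library(methods)

setClass("RoxyTag", representation("VIRTUAL", text = "character"))
setClass("RoxyTitle", contains = "RoxyTag")
setClass("RoxyFamily", contains = "RoxyTag")

# Every method receives the tag and all rocblocks.
setGeneric("roxyParse",
           function(tag, rocblocks) standardGeneric("roxyParse"))

# A local tag simply ignores the rocblocks argument.
setMethod("roxyParse", "RoxyTitle", function(tag, rocblocks) tag@text)

# A global tag scans all rocblocks for members of the same family.
setMethod("roxyParse", "RoxyFamily", function(tag, rocblocks) {
  same <- vapply(rocblocks,
                 function(block) identical(block$family, tag@text),
                 logical(1))
  names(rocblocks)[same]
})

# Hypothetical rocblocks: topic name -> parsed fields.
rocblocks <- list(
  foo = list(family = "adders"),
  boo = list(family = "adders"),
  baz = list(family = "other")
)

title <- roxyParse(new("RoxyTitle", text = "Add numbers"), rocblocks)
members <- roxyParse(new("RoxyFamily", text = "adders"), rocblocks)
```

With this shape there is a single entry point, and whether a tag is local or global is decided inside its method.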
The end user can call getClass("roxy_tag") to see all the tags which
are available. The same applies to methods.
HW> Hmmm, that would be nice.
Especially if other packages extend the tag system :)
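For instance, with a couple of hypothetical tag classes defined, the available tags could be listed from the class definition itself. The class names and the "roxy_rd" generic below are assumptions:

```r
library(methods)

setClass("roxy_tag",
         representation("VIRTUAL", name = "character", text = "character"))
setClass("roxy_tag_title", contains = "roxy_tag")
setClass("roxy_tag_family", contains = "roxy_tag")

setGeneric("roxy_rd", function(tag) standardGeneric("roxy_rd"))
setMethod("roxy_rd", "roxy_tag", function(tag) tag@text)

# The definition of the base class records its known subclasses.
available_tags <- names(getClass("roxy_tag")@subclasses)

# The methods defined for a generic can be listed in a similar way.
showMethods("roxy_rd")
```

Tags defined by other packages would show up in the same list once those packages are loaded.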
All of this looks simple, consistent and transparent to me. In order to
extend roxygen, one would not need to dig into the code and learn how it
works and try to find workarounds to implement features which are not
there. But, instead, just start writing methods and classes directly.
HW> I think there are two issues at play:
HW> * whether to use S3 or S4
HW> * the design of the object system
HW> I think the object system can definitely be improved (and your
HW> discussion is really helpful), but I'm not convinced that using S4
HW> over S3 brings enough advantages to make it worthwhile. It would be
HW> very useful if you could lay out what the main advantages of S4 in
HW> your mind are in this situation. That would help me think it through.
HW> (Two advantages: it would force me to make roxygen S4 support a lot
HW> better, and would force me to use S4 on a larger project ;)
HW> My feeling is that generally R users are more familiar and comfortable
HW> with S3 rather than S4. So it might make it less likely to get
HW> contributions.
Now roxy users have to learn the roxyClasses system ;). And by building new
packages on S3 you are actually contributing to the rooting and rotting of
S3. It's surprising that people are so stuck with it. S4 is so simple;
there are only two main functions, setClass and setMethod. Nobody needs to
know more.
I can hardly add anything new to the well-known S4 advantages. Here are a
couple of obvious thoughts:
- S4 is the R standard and R-core encourages using it. S3 is virtually
subsumed by S4 right now.
- Extension across packages is handled completely in the background,
and it takes a huge load off your shoulders. There are myriad
classes and objects which people might want to document in a
different way: RefClasses, rJava, Cpp, proto, etc. Roxygen should
not care about them. Each package should define its own tags,
parsers, Rd converters etc.
- S4 is actually a good system, with multiple inheritance and multiple
dispatch, something which is implemented by only a handful of languages.
- Roxygen might need multiple dispatch/inheritance, even if it is not
apparent right now.
- Type checks are done automatically.
- Building new tags on top of others is easy. Inheriting the behavior
is automatic.
- Conversion from the current system of roccers to S4 is trivial. You
handle almost everything as list structures, which will become S4
objects with slots.
- Explaining how to extend roxygen won't take more than half a page:
Reading the source file -> Splitting into rocblocks -> Parsing with
roxyParse -> Converting to Rd with roxyRd. Please write your
roxyParse and roxyRd methods.
- People will finally learn some S4 :)
There must be more, but it's getting too late.
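To illustrate the conversion and type-check points in the list above, here is a sketch that turns a roccer-style list into an S4 object. The class "Roccer", the helper as_s4_roccer and the example parser and output functions are hypothetical:

```r
library(methods)

# S4 counterpart of structure(list(name, parser, output), class = "roccer").
setClass("Roccer", representation(name = "character",
                                  parser = "function",
                                  output = "function"))

# Hypothetical helper: copy the list elements into slots.
as_s4_roccer <- function(x) {
  new("Roccer", name = x$name, parser = x$parser, output = x$output)
}

old <- structure(
  list(name = "family",
       parser = function(text) trimws(text),
       output = function(value) paste("family:", value)),
  class = "roccer"
)
converted <- as_s4_roccer(old)
result <- converted@output(converted@parser("  adders "))

# Slot types are checked by new(): a numeric name is rejected.
bad <- tryCatch(
  new("Roccer", name = 1, parser = identity, output = identity),
  error = function(e) "type error"
)
```

The list version would accept the numeric name silently; the S4 version fails at construction time.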
HW> Can you give an example? Generally, if you can automatically generate
HW> the template, why can't you automatically generate the Rd directly?
What I meant here is the following. Suppose you have a function
foo <- function(a = 4, b = 34){
a + b
}
To start documenting the function, a user might want to insert a
skeleton for the documentation (a template) like the following
##' ..description
##'
Depending on the editor, this might be bound to a key.
HW> Ah, I see. And you see this being the role of roxygen to generate,
HW> rather than the editor?
Right, but if one editor has already done that (ESS for example), why
not reuse the code in an editor-independent way?
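An editor-independent template generator could be as small as the following sketch. The plain function roxy_template, its prefix argument and the @param lines it emits are assumptions here, not an existing roxygen interface:

```r
# Build a documentation skeleton from the formal arguments of a function.
roxy_template <- function(fun, prefix = "##'") {
  args <- names(formals(fun))
  c(paste(prefix, "..description"),
    prefix,
    # One @param line per argument; none if the function has no arguments.
    sprintf("%s @param %s", prefix, args))
}

foo <- function(a = 4, b = 34) {
  a + b
}

template <- roxy_template(foo)
writeLines(template)
```

An editor would only need to call the function and insert the returned lines above the definition.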
HW> It seems like there's a lot that people could disagree on (##' vs
HW> #', do you need @author) etc.
It could be customized in options$roxygen or the like. If roxygen
establishes a uniform customization interface, then editors will be
forced to pick it up from there.
There might also be a roxy_update(OBJECT, OLD_TEXT) method which would
take OLD_TEXT and output a modified version of it to account for changes
in OBJECT. For example, if you have documented parameters a and b above,
and then decided to rename b into c, then roxy_update will just change
@param b into @param c. An editor can bind this to a key.
HW> I think that's a nice idea, but a lot of work!
I would leave that to users and editors. ESS does the updating well for
functions, and the functionality will come pretty fast for other
objects, once the proper interface is in place.
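A very rough sketch of the roxy_update idea for functions, written as a plain function rather than a method. It handles only the simplest case, where exactly one documented parameter has disappeared and exactly one new argument is undocumented; anything more ambiguous is left untouched:

```r
roxy_update <- function(fun, old_text) {
  args <- names(formals(fun))

  # Parameters that the existing documentation mentions.
  is_param <- grepl("@param\\s+\\S+", old_text, perl = TRUE)
  documented <- sub(".*@param\\s+(\\S+).*", "\\1", old_text[is_param],
                    perl = TRUE)

  stale <- setdiff(documented, args)  # documented, but no longer an argument
  fresh <- setdiff(args, documented)  # an argument, but not documented

  # Only treat an unambiguous one-to-one change as a rename.
  if (length(stale) == 1 && length(fresh) == 1) {
    pattern <- paste0("(@param\\s+)\\Q", stale, "\\E(?=\\s|$)")
    old_text <- sub(pattern, paste0("\\1", fresh), old_text, perl = TRUE)
  }
  old_text
}

# b has been renamed to c since the documentation was written.
foo <- function(a = 4, c = 34) a + c
old_text <- c("##' Add two numbers",
              "##' @param a first number",
              "##' @param b second number")
new_text <- roxy_update(foo, old_text)
```

Additions, removals and reordering of arguments would need more thought than this.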
Vitalie.