Discussion:
[Roxygen-devel] roxygen3
Hadley Wickham
2012-08-27 23:53:55 UTC
Permalink
Hi all,

I thought I should mention I've started working on roxygen3 at
https://github.com/hadley/roxygen3. It's a ground up rewrite of
roxygen2, aiming to produce the same results as roxygen2, but with
a completely different backend. Currently the roxygen2 code is hard
to extend - the idea of roclets was good, but I think they were too
big - you want to be able to work at the tag level so it's easier to
add new features.

The new roccers in roxygen3 look something like this:

parse_dev <- function(roc, ...) {
  if (is.null(roc$dev)) return()
  list(
    title = str_c("[DEV] ", roc$title),
    description = c("This function is useful only for developers",
      roc$description),
    dev = NULL)
}

add_roccer("dev", roc_parser(one = parse_dev))
base_prereqs[["dev"]] <- c("_intro", "title", "details")

That defines @dev which adds a note to the title and the description
that the function is more suitable for developers than end-users.
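
To see what parse_dev returns without installing roxygen3, here is a minimal standalone sketch. It assumes a roc is just a named list of tag values, and it swaps str_c for base R's paste0 so nothing beyond base R is needed. add_roccer, roc_parser and base_prereqs are left out, since their behaviour is not described in this thread.

```r
# Standalone sketch: `roc` is assumed to be a plain named list of tag values.
parse_dev <- function(roc, ...) {
  # No @dev tag on this block: nothing to change
  if (is.null(roc$dev)) return()
  list(
    title = paste0("[DEV] ", roc$title),
    description = c("This function is useful only for developers",
                    roc$description),
    dev = NULL)
}

# A block that carries an (empty) @dev tag
roc <- list(title = "Parse a block", description = "Parses one block.", dev = "")
res <- parse_dev(roc)
res$title        # the title, prefixed with "[DEV] "
res$description  # the developer note, followed by the original description
```

A block without a dev entry comes back as NULL, which presumably signals that the tag has nothing to contribute.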

The interface is likely to change a lot, and I doubt the package
currently installs (although it does work with devtools::load_all),
but if you're interested please take a look. I'm hoping a stronger
foundation will make it easier to keep on top of the bugs and to
flexibly implement new features.

Hadley
--
Assistant Professor
Department of Statistics / Rice University
http://had.co.nz/
Brian G. Peterson
2012-08-28 00:01:07 UTC
Permalink
Post by Hadley Wickham
I thought I should mention I've started working on roxygen3 at
https://github.com/hadley/roxygen3. It's a ground up rewrite of
roxygen2, aiming to produce the same results as roxygen2, but with
a completely different backend. Currently the roxygen2 code is hard
to extend - the idea of roclets was good, but I think they were too
big - you want to be able to work at the tag level so it's easier to
add new features.
I'm thrilled to hear that you are working to improve the roxygen backend.

I use roxygen (now 2, ick) to document over ten packages, with more in
development, and it's a huge productivity enhancer. Thanks for taking
it over and maintaining it.

I'm *not* thrilled that you intend to release another package instead of
just using version numbers, as has been the best practice in software
development for decades.

Could you *please* talk to some actual computer scientists and at least
consider calling it roxygen version 3.0 or roxygen2 version 3.0 instead
of releasing a new package name every time you feel the need to refactor
the code? It is commonly understood that major version numbers may
break backwards compatibility... no need to needlessly break the package
name too.

Regards,

- Brian
--
Brian G. Peterson
http://braverock.com/brian/
Ph: 773-459-4973
IM: bgpbraverock
Hadley Wickham
2012-08-28 15:16:31 UTC
Permalink
Post by Brian G. Peterson
Could you *please* talk to some actual computer scientists and at least
consider calling it roxygen version 3.0 or roxygen2 version 3.0 instead of
releasing a new package name every time you feel the need to refactor the
code? It is commonly understood that major version numbers may break
backwards compatibility... no need to needlessly break the package name too.
I don't think you understand the reality of the R package management
system. Most people will run update.packages() and get new versions of
all packages. If a package has API breaking changes then it will cause
considerable frustration, especially given how difficult it is to
install a previous version of a package.

Hadley
--
Assistant Professor
Department of Statistics / Rice University
http://had.co.nz/
Vitalie Spinu
2012-08-28 15:53:28 UTC
Permalink
Post by Hadley Wickham
Hadley Wickham <hadley at rice.edu>
Could you *please* talk to some actual computer scientists and at least
consider calling it roxygen version 3.0 or roxygen2 version 3.0 instead of
releasing a new package name every time you feel the need to refactor the
code? It is commonly understood that major version numbers may break
backwards compatibility... no need to needlessly break the package name too.
I don't think you understand the reality of the R package management
system. Most people will run update.packages() and get new versions of
all packages. If a package has API breaking changes then it will cause
considerable frustration, especially given how difficult it is to
install a previous version of a package.
Maybe it's time to revert to the "roxygen" name?

Roxygen is not required by any other package, so there are no dependency
problems. Also, if the user interface is not changed and only the internals
are refactored, then there is no need for a new name, is there?

Another option would be to release a new "old" package like
"roxygen_old", "ggplot_old" etc. Then people can just use the old one
with minimal inconvenience. The problem with this, is that there might
be papers/books published with the old command interface, but people
understand that, so not a big deal anyways.

Vitalie
Hadley Wickham
2012-08-28 16:07:18 UTC
Permalink
Post by Vitalie Spinu
May be it's time to revert to "roxygen" name?
Roxygen3 may well revert to roxygen2, but I don't currently want to
tie myself to non-breaking changes. (e.g. I'm considering allowing
markdown syntax in roxygen comments)
Post by Vitalie Spinu
Roxygen is not required by any other package; so no dependence
problems. Also if user interface is not changed, and only internals are
refactored, then there is no need for a new name, is it?
If there are breaking changes, and you have packages with old and new
systems, then you will still need both installed.
Post by Vitalie Spinu
Another option would be to release a new "old" package like
"roxygen_old", "ggplot_old" etc. Then people can just use the old one
with minimal inconvenience. The problem with this, is that there might
be papers/books published with the old command interface, but people
understand that, so not a big deal anyways.
But that doesn't resolve the problem of update.packages() breaking
code, which is the whole point of bumping the package name.

Hadley
--
Assistant Professor
Department of Statistics / Rice University
http://had.co.nz/
Andrew Redd
2012-08-28 15:13:08 UTC
Permalink
Hadley,
Great news. Will we be able to do the Doxygen style commenting of
function parameters that I have discussed with you earlier? Will that
be possible or should I still pursue doing that within lint?

Thanks,
Andrew Redd
Hadley Wickham
2012-08-28 15:17:54 UTC
Permalink
Hi Andrew,

The comments are still attached only to top-level expressions, so it
still won't work. I'm thinking about how to change that to deal with
(e.g.) reference classes, so it's possible I'll figure it out, but I
wouldn't hold your breath.

Hadley
--
Assistant Professor
Department of Statistics / Rice University
http://had.co.nz/
Andrew Redd
2012-08-28 15:26:27 UTC
Permalink
I would be happy to contribute an extractor function to get the relevant
comments. We would have a dependency on parser or R>2.16 if we use
Duncan's version that made it into core R. But we have all the
machinery to extract the information. Perhaps we could agree on a
standard for organizing the information so that the roccers could handle it.

-Andrew
Hadley Wickham
2012-08-28 15:29:33 UTC
Permalink
In that case, take a look at rocblock-parse.R (particularly
parse_text) and let me know what you come up with.

Hadley
--
Assistant Professor
Department of Statistics / Rice University
http://had.co.nz/
Vitalie Spinu
2012-08-28 16:27:15 UTC
Permalink
Hi Hadley,
Hadley Wickham <hadley at rice.edu>
HW> Hi all,
HW> I thought I should mention I've started working on roxygen3 at
HW> https://github.com/hadley/roxygen3. It's a ground up rewrite of
HW> roxygen2, aiming to produce the same results as roxygen2, but with
HW> a completely different backend. Currently the roxygen2 code is hard
HW> to extend - the idea of roclets was good, but I think they were too
HW> big - you want to be able to work at the tag level so it's easier to
HW> add new features.

Is it object oriented this time? From what I can follow from the HEAD,
it comes pretty close, but it is not S4 driven.

Maybe I am missing the point here, but it looks to me that doc/tag
parsers should be generics. A package might want to document and parse
the roxy-doc of its objects in a class-dependent way. So why not use
the native S4 mechanism?

It would also be good to have a roc_template generic, which would generate
a bare-bone documentation of an object. This is for editors to be able
to quickly insert a roxygen template based on the object at the cursor
and its class.

Thanks,
Vitalie
Hadley Wickham
2012-08-28 16:46:48 UTC
Permalink
Post by Vitalie Spinu
Is it object oriented this time? From what I can follow from the HEAD,
it comes pretty close, but it is not S4 driven.
Yes, it's more object-oriented, but with S3, not S4.
Post by Vitalie Spinu
May be I am missing the point here, but it looks to me that doc/tag
parsers should be generics. A package might want to document and parse
the roxy-doc of it's objects in a class dependent way. So why not to use
the native S4 mechanism?
See the doctype, object_from_call, usage and default_exports generics.
I don't see the need for every tag to be class-aware, just a few.
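
As a rough illustration of what a class-aware S3 generic of this kind could look like, here is a minimal sketch. The name usage_sketch and its methods are hypothetical and are not taken from the roxygen3 source; the real usage generic may have a different signature.

```r
# Hypothetical generic: builds a usage string in a class-dependent way.
usage_sketch <- function(obj, name) UseMethod("usage_sketch")

# Functions get their formal argument names listed
usage_sketch.function <- function(obj, name) {
  paste0(name, "(", paste(names(formals(obj)), collapse = ", "), ")")
}

# Anything else (e.g. a data object) is documented by name only
usage_sketch.default <- function(obj, name) name

foo <- function(a = 4, b = 34) a + b
usage_sketch(foo, "foo")
usage_sketch(1:3, "x")
```

Only the handful of generics that need to know the class of the documented object dispatch like this; tags that do not care about the class can stay ordinary functions.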
Post by Vitalie Spinu
Also would be good to have a roc_template generic, which would generate
a bare-bone documentation of an object. This is for editors to be able
to quickly insert a roxygen template based on the object at the cursor
and its class.
Can you give an example? Generally, if you can automatically generate
the template, why can't you automatically generate the Rd directly?

Hadley
--
Assistant Professor
Department of Statistics / Rice University
http://had.co.nz/
Hadley Wickham
2012-08-28 16:49:16 UTC
Permalink
I've also been wondering about splitting "roxygen3" into two packages
- one that defines all the basic objects etc and one that creates all
the tags. That would more cleanly separate the implementation
from the tag system, and make it easier for others to use roxygen to
build alternative documentation systems. One priority for roxygen3 is
that it should be much easier to extend, so that if I don't want to
include something, it's still easy for you to put it in an add-on
package.

Hadley
--
Assistant Professor
Department of Statistics / Rice University
http://had.co.nz/
Vitalie Spinu
2012-08-28 19:59:24 UTC
Permalink
Hadley Wickham <hadley at rice.edu>
HW> I've also been wondering about splitting "roxygen3" into two packages
HW> - one that defines all the basic objects etc and one that creates all
HW> the tags.

An add-on package would have to load both right? So what is the
advantage of the split then?
Post by Vitalie Spinu
May be I am missing the point here, but it looks to me that doc/tag
parsers should be generics. A package might want to document and parse
the roxy-doc of it's objects in a class dependent way. So why not to use
the native S4 mechanism?
See the doctype, object_from_call, usage and default_exports generics.
Ok, I looked into it.

It's still the pseudo dispatch on textual representation of the object
definition. That is you parse the "foo <- function(" and dispatch
(object_from_call) on "function", setGeneric(foo) is dispatched on
"setGeneric" etc.
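
The kind of dispatch on the textual form of a definition described above can be sketched as follows. call_kind is a hypothetical helper written for illustration; it is not roxygen3's object_from_call, and it only looks at the name of the outermost call.

```r
# Hypothetical helper: classify a top-level expression by the name of the
# call that creates the object, without evaluating anything.
call_kind <- function(expr) {
  head <- as.character(expr[[1]])[1]
  # For `name <- value`, look at the call on the right-hand side instead
  if (head %in% c("<-", "=") && is.call(expr[[3]])) {
    return(as.character(expr[[3]][[1]])[1])
  }
  head
}

src <- c(
  "foo <- function(a) a",
  "setGeneric('bar', function(x) standardGeneric('bar'))",
  "aaa <- local({ 1 })"
)
kinds <- vapply(as.list(parse(text = src)), call_kind, character(1))
kinds
```

The last expression is classified by the call to local rather than by whatever object that call computes, which is the limitation being pointed out.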

Wouldn't it be better to inspect the evaluation environment for the
traces of the evaluation and then dispatch on the objects discovered?
Then the code

aaa <- local({ ... compute object ..})

will correctly dispatch on aaa and won't be ignored.

It will be possible to create documentation for a bunch of objects at
the same time. For example

local({ a <- generate_object_a()
        b <- generate_object_b()})

will document both a and b.
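
A minimal sketch of the alternative, evaluating first and then dispatching on what was actually created, might look like this. It assumes the source can be evaluated safely in a fresh environment; the variable names are made up for the example.

```r
# Evaluate the source in a scratch environment, then inspect the results.
src <- "aaa <- local({ function(x) x + 1 })
bbb <- 1:3"

env <- new.env()
for (expr in as.list(parse(text = src))) eval(expr, envir = env)

# Dispatch could now use the class of each object that was really created
classes <- vapply(ls(env), function(n) class(get(n, envir = env))[1],
                  character(1))
classes
```

Here aaa is seen as a function even though its definition is wrapped in local, at the cost of having to match each discovered object back to its documentation block.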
I don't see the need for every tag to be class aware just a few.
This is a complication. You end up implementing your own OO
system. For example, from roccer-.r:

#' The roccer object is a key component in roxygen3 - it defines the behaviour
#' of a tag with a \code{parser} and a \code{output} write.

Why would you need a roccer if you already have "classes" and "methods"
to define the behavior of the tag?

You can just have a virtual S4 class "roxy_tag". Then subclass
"roxy_tag_oxygen" and have all other tags derive from that. Most of them
will probably have only two slots, "name" and "text", but some like
"slots" tag will have more.

Then you can have a "roxy_split_oxygen(object, doc)" generic dispatched on
object which would split the string 'doc' into tags. Each tag is an
object. Then another generic "roxy_parse(tag)" to actually parse the
tag. Another generic "roxy_rd(tag)" to generate rd entry, and yet
another generic "roxy_template(tag)" to generate template. And so on.

The end user can getClass("roxy_tag") to see all the tags which
are available. Same applies to methods.

To add a markdown syntax or whatever, you can just define
"roxy_tag_markdown" class for tags, and add corresponding methods
"roxy_parse", "roxy_split_markdown" etc

All of this looks simple, consistent and transparent to me. In order to
extend roxygen, one would not need to dig into the code and learn how it
works and try to find workarounds to implement features which are not
there. But, instead, just start writing methods and classes directly.
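
The S4 layout proposed above can be sketched in a few lines. All class and generic names here carry a Sketch suffix to make clear that they are hypothetical; only setClass, setGeneric, setMethod, new and getClass are real functions from the methods package.

```r
library(methods)

# Virtual base class: every tag has a name and some text
setClass("RoxyTagSketch",
         representation("VIRTUAL", name = "character", text = "character"))

# One concrete tag class; more specialised tags could add further slots
setClass("RoxyParamSketch", contains = "RoxyTagSketch")

# Generic that turns a tag into an Rd fragment
setGeneric("roxy_rd_sketch", function(tag) standardGeneric("roxy_rd_sketch"))
setMethod("roxy_rd_sketch", "RoxyParamSketch", function(tag) {
  paste0("\\item{", tag@name, "}{", tag@text, "}")
})

tag <- new("RoxyParamSketch", name = "a", text = "First addend.")
rd <- roxy_rd_sketch(tag)
rd

# The known tag classes can be listed from the base class
known_tags <- names(getClass("RoxyTagSketch")@subclasses)
known_tags
```

An add-on package would extend this by defining its own subclass and methods, without touching the code that defines the base class.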
Post by Vitalie Spinu
Also would be good to have a roc_template generic, which would generate
a bare-bone documentation of an object. This is for editors to be able
to quickly insert a roxygen template based on the object at the cursor
and its class.
Can you give an example? Generally, if you can automatically generate
the template, why can't you automatically generate the Rd directly?
What I meant here is the following. Suppose you have a function
declaration:

foo <- function(a = 4, b = 34){
    a + b
}

To start documenting the function, a user might want to insert a
skeleton for the documentation (a template) like the following:

##' ..description
##'
##' @title
##' @param a
##' @param b
##' @return
##' @author User Name
##' @examples

Depending on the editor, this might be bound to a key.
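
A minimal sketch of such a template generator for plain functions, using only base R, could be the following. roxy_template_sketch is a hypothetical name, and the @author line is left out because it needs information about the user.

```r
# Hypothetical helper: build a roxygen skeleton from a function's formals.
roxy_template_sketch <- function(fun, prefix = "##' ") {
  params <- names(formals(fun))
  param_lines <- if (length(params)) paste("@param", params) else character(0)
  lines <- c("..description", "", "@title", param_lines,
             "@return", "@examples")
  # Drop the trailing space that the prefix leaves on the empty line
  trimws(paste0(prefix, lines), which = "right")
}

foo <- function(a = 4, b = 34){
    a + b
}
skeleton <- roxy_template_sketch(foo)
writeLines(skeleton)
```

An editor would call something like this on the object at the cursor and insert the resulting lines above the definition.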

There might be also a roxy_update(OBJECT, OLD_TEXT) method which would
take OLD_TEXT and output a modified version of it to account for changes
in OBJECT. For example, if you have documented parameters a and b above,
and then decided to rename b into c, then roxy_update will just change
@param b into @param c. An editor can bind this to a key.
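
For the simplest case, a renamed parameter, the update step could be sketched with a regular expression. rename_param_sketch is a hypothetical helper; a real roxy_update would have to compare the old documentation with the new object to work out what changed.

```r
# Hypothetical helper: rename one documented parameter in existing roxygen text.
# `from` is used as a regular expression, so it should be a plain name.
rename_param_sketch <- function(old_text, from, to) {
  pattern <- paste0("(@param\\s+)", from, "\\b")
  sub(pattern, paste0("\\1", to), old_text, perl = TRUE)
}

old_text <- c("##' @param a first addend", "##' @param b second addend")
new_text <- rename_param_sketch(old_text, from = "b", to = "c")
new_text
```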

Vitalie
Hadley Wickham
2012-08-28 20:19:46 UTC
Permalink
Post by Vitalie Spinu
HW> I've also been wondering about splitting "roxygen3" into two packages
HW> - one that defines all the basic objects etc and one that creates all
HW> the tags.
An add-on package would have to load both right? So what is the
advantage of the split then?
* When you look at the documentation, there's less confusion between
what a user needs, and what a developer needs

* You can use the roxygen framework without buying into any of my
documentation philosophy.
Post by Vitalie Spinu
Ok, I looked into it.
It's still the pseudo dispatch on textual representation of the object
definition. That is you parse the "foo <- function(" and dispatch
(object_from_call) on "function", setGeneric(foo) is dispatched on
"setGeneric" etc.
Wouldn't it be better to inspect the evaluation environment for the
traces of the evaluation and then dispatch on the objects discovered?
Then the code
aaa <- local({ ... compute object ..})
will correctly dispatch on aaa and won't be ignored.
It does do that - but you need to parse the call because there are
a number of calls that produce global side effects (e.g. creating
classes and methods). You could try doing it after the fact (e.g. by
using S4 introspection to find all the objects to document), but then
it's much more difficult to match the documentation block with the
object.

The pseudo-S3 dispatch isn't particularly elegant, but it seemed like
a good 90% solution.
Post by Vitalie Spinu
It will be possible to create documentation for a bunch of objects at
the same time. For example
local({ a <- generate_object_a()
b <- generate_object_b()})
will document both a and b.
Could you flesh out this example a bit more? I don't understand why
you'd want to document objects that aren't evaluated by the user.
Post by Vitalie Spinu
Post by Hadley Wickham
I don't see the need for every tag to be class aware just a few.
This is a complication. You ending up in implementing your own OO
#' The roccer object is a key component in roxygen3 - it defines the behaviour
#' of a tag with a \code{parser} and a \code{output} write.
Why would you need an roccer if you already have "classes" and "methods"
to define the behavior of the tag?
I'm not sure I get your point - the roccer _is_ the object that
represents the tag.
Post by Vitalie Spinu
You can just have a virtual S4 class "roxy_tag". Then subclass
"roxy_tag_oxygen" and have all other tags derive from that. Most of them
will probably have only two slots, "name" and "text", but some like
"slots" tag will have more.
Then you can have "roxy_split_oxigen(object, doc)" generic dispatched on
object which would split the string 'doc' into tags. Each tag is an
object. Then another generic "roxy_parse(tag)" to actually parse the
tag. Another generic "roxy_rd(tag)" to generate rd entry, and yet
another generic "roxy_template(tag)" to generate template. And so on.
I think this is more in line with how roxygen2 works. You can't have
methods that just work with a single documentation block + object
(rocblock for short) at a time, because some tags work more globally
(e.g. @family, @include, @inheritParams). That's more of a comment on
your suggested function names rather than using S4 - but it does have
a big impact on the API.
Post by Vitalie Spinu
The end user can getClass("roxy_tag") to see all the tags which
are available. Same applies to methods.
Hmmm, that would be nice.
Post by Vitalie Spinu
All of this looks simple, consistent and transparent to me. In order to
extend roxygen, one would not need to dig into the code and learn how it
works and try to find workarounds to implement features which are not
there. But, instead, just start writing methods and classes directly.
I think there are two issues at play:

* whether to use S3 or S4
* the design of the object system

I think the object system can definitely be improved (and your
discussion is really helpful), but I'm not convinced that using S4
over S3 brings enough advantages to make it worthwhile. It would be
very useful if you could lay out what the main advantages of S4 in
your mind are in this situation. That would help me think it through.
(Two advantages: it would force me to make roxygen S4 support a lot
better, and would force me to use S4 on a larger project ;)

My feeling is that generally R users are more familiar and comfortable
with S3 rather than S4. So it might make it less likely to get
contributions.
Post by Vitalie Spinu
Post by Hadley Wickham
Can you give an example? Generally, if you can automatically generate
the template, why can't you automatically generate the Rd directly?
What I meant here is the following. Suppose you have a function
foo <- function(a = 4, b = 34){
a + b
}
To start documenting the function, a user might want to insert a
skeleton for a documentation (template) like following
##' ..description
##'
Depending on the editor, this might be bound to a key.
Ah, I see. And you see this being the role of roxygen to generate,
rather than the editor? It seems like there's a lot that people could
Post by Vitalie Spinu
There might be also a roxy_update(OBJECT, OLD_TEXT) method which would
take OLD_TEXT and output a modified version of it to account for changes
in OBJECT. For example, if you have documented parameters a and b above,
and then decided to rename b into c, then roxy_update will just change
@param b into @param c. An editor can bind this to a key.
I think that's a nice idea, but a lot of work!

Hadley
--
Assistant Professor
Department of Statistics / Rice University
http://had.co.nz/
Vitalie Spinu
2012-08-28 23:46:51 UTC
Permalink
Hadley Wickham <hadley at rice.edu>
Wouldn't it be better to inspect the evaluation environment for the
traces of the evaluation and then dispatch on the objects discovered?
Then the code
aaa <- local({ ... compute object ..})
will correctly dispatch on aaa and won't be ignored.
HW> It does do that - but you need to parse the call because there are
HW> number of calls that produce global side effects (e.g. creating
HW> classes and methods). You could try doing it after the fact (e.g. by
HW> using S4 introspection to find all the objects to document), but then
HW> it's much more difficult to match the documentation block with the
HW> object.

Not that difficult. S4 always leaves traces in the evaluation
environment, objects starting with:

methods:::.TableMetaPattern()
[1] "^[.]__T__"
methods:::.ClassMetaPattern()
[1] "^[.]__C__"

Inspecting those, you know exactly what was installed.

S3 methods are left in the ".__S3MethodsTable__." object locally. So no
trouble at all. I have done this before, and can provide the necessary
code.
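
A small sketch of this kind of inspection, assuming the class is defined into a scratch environment through the where argument of setClass. The class name TraceDemoSketch is made up, and the pattern is the one printed above rather than a call into the methods internals.

```r
library(methods)

# Define an S4 class into a scratch environment instead of the global one
env <- new.env()
setClass("TraceDemoSketch", representation(x = "numeric"), where = env)

# The class definition is stored under a hidden metadata name
all_names <- ls(env, all.names = TRUE)
class_traces <- grep("^[.]__C__", all_names, value = TRUE)
class_traces
```

Stripping the prefix from each match gives the names of the classes that were installed while the code was evaluated.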

It looks like internal hackery, but it's really not. This
implementation will hardly ever change, and if it changes, it will be easy to
adapt.
It will be possible to create documentation for a bunch of objects at
the same time. For example
local({ a <- generate_object_a()
b <- generate_object_b()})
will document both a and b.
HW> Could you flesh out this example a bit more? I don't understand why
HW> you'd want to document objects that aren't evaluated by the user.

Ah sorry, that was stupid. I meant

eval({ a <- generate_object_a()
       b <- generate_object_b()})

Roxygen could adopt a convention: if two declarations follow
immediately after each other, they should be documented in the same
roxy-doc and the same Rd file:

foo <- function(a) ..
boo <- function(a) ..

will put both foo and boo in the same file. Currently one needs two
documentation blocks and an rdname tag, if I am not mistaken.
Post by Hadley Wickham
I don't see the need for every tag to be class aware just a few.
This is a complication. You ending up in implementing your own OO
#' The roccer object is a key component in roxygen3 - it defines the behaviour
#' of a tag with a \code{parser} and a \code{output} write.
Why would you need an roccer if you already have "classes" and "methods"
to define the behavior of the tag?
HW> I'm not sure I get your point - the roccer _is_ the object that
HW> represents the tag.

You meant roccer as an abstract encapsulation of the behavior of the
tag. That is

structure(list(name = name, parser = parser, output = output), class = "roccer")

An abstract notion of a "tag" has a representation with a name and two
methods which describe the behavior of a tag. This is precisely the task
of a class/method system.

The add_roccer and roccer functions are basically a replacement for setClass
and setMethod.

In S4, instead of roccer + add_roccer + basic_roccer + etc., you might do:

setClass("RoxyTag", list(name = "character"))
setGeneric("roxyParse", function(tag) NULL)
setGeneric("roxyRd", function(tag) NULL)


For every tag:

setClass("RoxyFamily", contains = "RoxyTag")
setMethod("roxyParse", signature = "RoxyFamily",
          def = ... )

instead of

parse_family <- function() ...
roc_family <- roccer("family",
  roc_parser(tag = text_tag(), all = parse_family))

It looks like you want a simple interface, but it ends up being a cross
between S3 and an internal roxygen system for keeping track of objects
(i.e. tags). Sort of a _roxyClasses_ approach, and it looks like you
haven't yet gotten to the inheritance and extension mechanism.

(Actually, I think I am starting to understand why you proposed to split the
package and keep the tags separate: to simplify the extension of
tags (is that it?). If the tags are S4 classes, then this is not a
problem. Any package can extend the system!)

To wrap up, quite a lot of the current code is essentially OO bookkeeping,
and can be completely eliminated by delegating the work to S4.
You can just have a virtual S4 class "roxy_tag". Then subclass
"roxy_tag_oxygen" and have all other tags derive from that. Most of them
will probably have only two slots, "name" and "text", but some like
"slots" tag will have more.
Then you can have "roxy_split_oxigen(object, doc)" generic dispatched on
object which would split the string 'doc' into tags. Each tag is an
object. Then another generic "roxy_parse(tag)" to actually parse the
tag. Another generic "roxy_rd(tag)" to generate rd entry, and yet
another generic "roxy_template(tag)" to generate template. And so on.
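The split-then-dispatch idea could be sketched roughly as follows. This is a simplified illustration, not roxygen code: roxy_split is written as a plain function rather than a generic, the intermediate "roxy_tag_oxygen" layer is left out, the fallback class roxy_tag_default is hypothetical, and multi-line tags are ignored.

```r
library(methods)

# Virtual parent: never instantiated, only used for dispatch and inheritance.
setClass("roxy_tag", representation("VIRTUAL",
                                    name = "character", text = "character"))
setClass("roxy_tag_default", contains = "roxy_tag")  # hypothetical fallback
setClass("roxy_tag_param", contains = "roxy_tag")

# Split comment lines into tag objects, one object per "@tag ..." line.
roxy_split <- function(doc) {
  lines <- sub("^#+'[[:space:]]?", "", doc)          # drop the #' or ##' prefix
  lines <- grep("^@", lines, value = TRUE)
  pattern <- "^@([^[:space:]]+)[[:space:]]*(.*)$"
  lapply(lines, function(line) {
    parts <- regmatches(line, regexec(pattern, line))[[1]]
    cls <- paste0("roxy_tag_", parts[2])
    if (!isClass(cls)) cls <- "roxy_tag_default"     # unknown tag: use fallback
    new(cls, name = parts[2], text = parts[3])
  })
}

setGeneric("roxy_rd", function(tag) standardGeneric("roxy_rd"))

# Default for every tag: \name{text}.
setMethod("roxy_rd", "roxy_tag", function(tag) {
  sprintf("\\%s{%s}", tag@name, tag@text)
})
# @param needs its own layout: \item{arg}{description}.
setMethod("roxy_rd", "roxy_tag_param", function(tag) {
  parts <- strsplit(tag@text, "[[:space:]]+")[[1]]
  sprintf("\\item{%s}{%s}", parts[1], paste(parts[-1], collapse = " "))
})

doc <- c("#' @param a first number", "#' @title Add numbers")
tags <- roxy_split(doc)
rd <- vapply(tags, roxy_rd, character(1))
```

Adding a new tag then amounts to one setClass call plus whichever methods differ from the defaults.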
HW> I think this is more in line with how roxygen2 works. You can't have
HW> methods that just work with a single documentation block + object
HW> (rocblock for short) at a time, because some tags work more globally
HW> (e.g. @family, @include, @inheritParams). That's more of a comment on
HW> your suggested function names rather than using S4 - but it does have
HW> a big impact on the API.

I didn't think about that. I barely understand how roxygen works as
yet :). You already have a rockblock_parser class. I guess it's just a
question of S3 or S4 then.

Actually, the parsing generic (roxyParse or whatever) can by default
take two arguments, the object and the whole bunch of rocblocks. Each
tag will decide for itself whether to use the rocblocks argument or not.
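A rough sketch of that two-argument generic, assuming rocblocks are plain named lists (an assumption for illustration; the real rocblock structure may differ). A "local" tag ignores the second argument, while a "global" tag such as @family inspects all blocks.

```r
library(methods)

setClass("RoxyTag", representation(name = "character", text = "character"))
setClass("RoxyTitle", contains = "RoxyTag")
setClass("RoxyFamily", contains = "RoxyTag")

# Every method receives the tag and the full set of rocblocks.
setGeneric("roxyParse",
           function(tag, rocblocks) standardGeneric("roxyParse"))

# Local tags: the default method never looks at rocblocks.
setMethod("roxyParse", "RoxyTag", function(tag, rocblocks) {
  trimws(tag@text)
})

# Global tag: find every block that belongs to the same family.
setMethod("roxyParse", "RoxyFamily", function(tag, rocblocks) {
  same <- vapply(rocblocks,
                 function(block) identical(block$family, tag@text),
                 logical(1))
  names(rocblocks)[same]
})

# Hypothetical rocblocks, represented here as simple lists.
rocblocks <- list(
  geom_point = list(title = "Points", family = "geoms"),
  geom_line  = list(title = "Lines", family = "geoms"),
  qplot      = list(title = "Quick plot")
)

title   <- roxyParse(new("RoxyTitle", name = "title", text = " Points "),
                     rocblocks)
members <- roxyParse(new("RoxyFamily", name = "family", text = "geoms"),
                     rocblocks)
```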

I guess you are already doing something similar, but I am a bit confused
about why the distinction between parse_rocblocks and roccer$parser is
necessary.
The end user can call getClass("roxy_tag") to see all the tags which
are available. The same applies to methods.
HW> Hmmm, that would be nice.

Especially if other packages extend the tag system :)
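As an illustration of the discovery point, the methods package already records which classes extend a given class and which methods are defined directly on them. The class names below are made up for the sketch.

```r
library(methods)

setClass("roxy_tag", representation("VIRTUAL", text = "character"))
setClass("roxy_tag_param", contains = "roxy_tag")
setClass("roxy_tag_family", contains = "roxy_tag")

setGeneric("roxy_rd", function(tag) standardGeneric("roxy_rd"))
setMethod("roxy_rd", "roxy_tag_param", function(tag) {
  sprintf("\\item{%s}", tag@text)
})

# All known tags: the subclasses recorded on the virtual parent.
available_tags <- sort(names(getClass("roxy_tag")@subclasses))

# Which tags have their own (non-inherited) roxy_rd method?
has_own_rd <- vapply(available_tags,
                     function(cls) existsMethod("roxy_rd", cls),
                     logical(1))
```

Interactively, showMethods("roxy_rd") prints the same information in readable form.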
All of this looks simple, consistent and transparent to me. In order to
extend roxygen, one would not need to dig into the code and learn how it
works and try to find workarounds to implement features which are not
there. But, instead, just start writing methods and classes directly.
HW> I think there are two issues at play:

HW> * whether to use S3 or S4
HW> * the design of the object system

HW> I think the object system can definitely be improved (and your
HW> discussion is really helpful), but I'm not convinced that using S4
HW> over S3 brings enough advantages to make it worthwhile. It would be
HW> very useful if you could lay out what the main advantages of S4 in
HW> your mind are in this situation. That would help me think it through.
HW> (Two advantages: it would force me to make roxygen S4 support a lot
HW> better, and would force me to use S4 on a larger project ;)

HW> My feeling is that generally R users are more familiar and comfortable
HW> with S3 rather than S4. So it might make it less likely to get
HW> contributions.

Now roxy users have to learn the roxyClasses system ;). And by building
new packages on S3 you are actually contributing to the rooting and
rotting of S3. It's surprising that people are so stuck with it. S4 is so
simple; there are only two main functions, setClass and setMethod. Nobody
needs to know more.

I can hardly add anything new to well known S4 advantages. Here are a
couple of obvious thoughts:

- S4 is the R standard and R-core encourages using it. S3 is virtually
subsumed by S4 right now.

- Extension across packages is completely handled in the background,
and it takes a huge load off your shoulders. There are myriads of
classes and objects which people might want to document in a
different way: RefClasses, rJava, Cpp, proto, etc. Roxygen should
not care about them. Each package should define its own tags,
parsers, Rd converters etc.

- S4 is actually a good system, with multiple inheritance and multiple
dispatch, things which are implemented by only a handful of languages.

- Roxygen might need multiple dispatch/inheritance, even if it is not
apparent right now.

- Type checks are done automatically.

- Building new tags on top of others is easy. Inheriting the behavior
is automatic.

- Conversion from the current system of roccers to S4 is trivial. You
handle almost everything as list structures, which will become S4
objects with slots.

- Explaining how to extend roxygen won't take more than half a page:

Reading source file -> Splitting into rocblocks -> Parsing with
roxyParse -> Converting to Rd with roxyRd. Please write your
roxyParse and roxyRd methods.

- People will finally learn some S4 :)
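Two of the points in the list above, automatic type checks and building one tag on top of another, can be shown in a few lines. This is a hedged sketch: the class names RoxyTitle and RoxyDevTitle are invented for the example, loosely echoing the @dev tag from earlier in the thread.

```r
library(methods)

setClass("RoxyTag", representation(name = "character", text = "character"))
setClass("RoxyTitle", contains = "RoxyTag")
setClass("RoxyDevTitle", contains = "RoxyTitle")

# Type checks come for free: a numeric name is rejected by new().
bad <- tryCatch(
  new("RoxyTag", name = 1, text = "x"),
  error = function(e) "rejected"
)

setGeneric("roxyRd", function(tag) standardGeneric("roxyRd"))

setMethod("roxyRd", "RoxyTitle", function(tag) {
  sprintf("\\title{%s}", tag@text)
})

# The derived tag only adds its prefix, then defers to the parent method.
setMethod("roxyRd", "RoxyDevTitle", function(tag) {
  tag@text <- paste("[DEV]", tag@text)
  callNextMethod(tag)
})

rd <- roxyRd(new("RoxyDevTitle", name = "title", text = "Internal helper"))
```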

There must be more, but it's getting too late.
Post by Hadley Wickham
Can you give an example? Generally, if you can automatically generate
the template, why can't you automatically generate the Rd directly?
What I meant here is the following. Suppose you have a function
foo <- function(a = 4, b = 34) {
  a + b
}
To start documenting the function, a user might want to insert a
documentation skeleton (template) like the following
##' ..description
##'
Depending on the editor, this might be bound to a key.
HW> Ah, I see. And you see this being the role of roxygen to generate,
HW> rather than the editor?

Right, but if one editor has already done that (ESS for example) why
not reuse the code in an editor-independent way?

HW> It seems like there's a lot that people could disagree on (##' vs
HW> #', do you need @author) etc.

It could be customized in options$roxygen or the like. If roxygen
establishes a uniform customization interface, then editors will be
forced to pick it up from there.
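An editor-independent template generator along these lines might look like the sketch below. Everything in it is an assumption made for illustration: the function name roxy_template, the option name "roxygen.prefix", and the exact layout of the skeleton (the template quoted in this thread is only partly shown).

```r
# Build a documentation skeleton for a function from its formal arguments.
# The comment prefix is read from a hypothetical option so that editors
# and users can agree on ##' versus #' in one place.
roxy_template <- function(fun,
                          prefix = getOption("roxygen.prefix", "##'")) {
  args <- names(formals(fun))
  lines <- c("..description", "", sprintf("@param %s", args), "@return")
  # Drop the trailing space that paste() leaves on empty lines.
  trimws(paste(prefix, lines), which = "right")
}

foo <- function(a = 4, b = 34) {
  a + b
}

template <- roxy_template(foo)
short <- roxy_template(foo, prefix = "#'")
```

An editor would only need to insert the returned lines above the function definition.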
There might also be a roxy_update(OBJECT, OLD_TEXT) method which would
take OLD_TEXT and output a modified version of it to account for changes
in OBJECT. For example, if you have documented parameters a and b above,
and then decided to rename b into c, then roxy_update will just change
@param b into @param c. An editor can bind this to a key.
HW> I think that's a nice idea, but a lot of work!

I would leave that to users and editors. ESS does the updating well for
functions, and the functionality will come pretty fast for other
objects, once the proper interface is in place.
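To make the roxy_update idea concrete, here is a small sketch for the function case only. It is an assumption-laden illustration, not ESS or roxygen code: it handles only plain renames (same number of removed and added arguments, paired in order) and expects one @param per line.

```r
# Rewrite "@param old" entries when an argument has been renamed.
roxy_update <- function(fun, old_text) {
  new_args <- names(formals(fun))
  pattern <- "^(.*@param[[:space:]]+)([^[:space:]]+)(.*)$"
  is_param <- grepl(pattern, old_text)
  old_args <- sub(pattern, "\\2", old_text[is_param])

  removed <- setdiff(old_args, new_args)  # documented but no longer present
  added <- setdiff(new_args, old_args)    # present but not yet documented
  # Only a one-to-one rename is handled; anything else is left untouched.
  if (length(removed) == 0 || length(removed) != length(added)) {
    return(old_text)
  }

  renamed <- added[match(old_args, removed)]  # NA for unchanged arguments
  keep <- is.na(renamed)
  renamed[keep] <- old_args[keep]

  old_text[is_param] <- paste0(
    sub(pattern, "\\1", old_text[is_param]),
    renamed,
    sub(pattern, "\\3", old_text[is_param])
  )
  old_text
}

# b has been renamed to c in the function, but not yet in the comments.
foo <- function(a = 4, c = 34) {
  a + c
}
old_text <- c("##' Add two numbers",
              "##' @param a first number",
              "##' @param b second number")
new_text <- roxy_update(foo, old_text)
```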

Vitalie.
Hadley Wickham
2012-08-29 13:25:50 UTC
Post by Vitalie Spinu
It looks like you want a simple interface, but it ends up being a cross
between S3 and an internal roxygen object-keeping (i.e. tag-keeping)
system. Sort of a _roxyClasses_ approach, and it looks like you haven't
yet gotten to the inheritance and extension mechanism.
Hmmm, point taken.
Post by Vitalie Spinu
(Actually, I think I am starting to understand why you proposed to split
the package and to keep tags separately: to simplify the extension of
tags (is that it?). If the tags are S4 classes, then this is not a
problem. Any package can extend the system!)
I'm more interested in establishing the boundaries between "this is a
useful system for writing a documentation domain specific language"
and "this is how Hadley thinks you should do documentation".
Post by Vitalie Spinu
HW> I think this is more in line with how roxygen2 works. You can't have
HW> methods that just work with a single documentation block + object
HW> (rocblock for short) at a time, because some tags work more globally
HW> (e.g. @family, @include, @inheritParams). That's more of a comment on
HW> your suggested function names rather than using S4 - but it does have
HW> a big impact on the API.
I didn't think about that. I barely understand how roxygen works as
yet:). You have rockblock_parser class already. I guess it's just a
question of S3 or S4 then.
I'd appreciate any comments on the overall design as described in my
other email.
Post by Vitalie Spinu
Actually, the parsing generic (roxyParse or whatever) can by default
take two arguments, the object and the whole bunch of rocblocks. Each
tag will decide for itself whether to use the rocblocks argument or not.
I think you need some subclasses that work with a single tag or a
single roc, because you can cache them more efficiently.
Post by Vitalie Spinu
I guess you are already doing something similar, but I am a bit confused
about why the distinction between parse_rocblocks and roccer$parser is
necessary.
Me too ;)
Post by Vitalie Spinu
HW> My feeling is that generally R users are more familiar and comfortable
HW> with S3 rather than S4. So it might make it less likely to get
HW> contributions.
Now roxy users have to learn the roxyClasses system ;). And by building
new packages on S3 you are actually contributing to the rooting and
rotting of S3. It's surprising that people are so stuck with it. S4 is so
simple; there are only two main functions, setClass and setMethod. Nobody
needs to know more.
Well, you also need to know something about how methods are
dispatched, and you need to invest much more in up-front design.
Post by Vitalie Spinu
I can hardly add anything new to well known S4 advantages. Here are a
couple of obvious thoughts:
- S4 is the R standard and R-core encourages using it. S3 is virtually
subsumed by S4 right now.
I think that's overstating the case - some of R core like S4, and some
of them hate it. There are still plenty of R core members writing new
S3 classes, generics and methods.

But I accept your other points, and it would be good practice for me
to do a package in S4 style. I'll have a go at converting to S4 in the
next week or so. I'll probably finish off the S4 documentation code
with the current system (so roxygen3 is basically feature complete)
and then I'll start the process of converting to S4.
Post by Vitalie Spinu
HW> Ah, I see. And you see this being the role of roxygen to generate,
HW> rather than the editor?
Right, but if one editor has already done that (ESS for example) why
not reuse the code in an editor-independent way?
HW> It seems like there's a lot that people could disagree on (##' vs
HW> #', do you need @author) etc.
It could be customized in options$roxygen or the like. If roxygen
establishes a uniform customization interface, then editors will be
forced to pick it up from there.
This isn't something I'm that interested in, but it would certainly be
easy for someone to submit a patch to implement it :)

Hadley
--
Assistant Professor
Department of Statistics / Rice University
http://had.co.nz/
Peter Danenberg
2012-09-08 18:52:33 UTC
There's no way we could develop this as a branch in the preëxisting
Roxygen repo, is there?
Post by Hadley Wickham
Hi all,
I thought I should mention I've started working on roxygen3 at
https://github.com/hadley/roxygen3. It's a ground up rewrite of
roxygen2, aiming to produce the same results as roxygen2, but with
a completely different backend. Currently the roxygen2 code is hard
to extend - the idea of roclets was good, but I think they were too
big - you want to be able to work at the tag level so it's easier to
add new features.
The new roccers in roxygen3 look something like this:
parse_dev <- function(roc, ...) {
  if (is.null(roc$dev)) return()
  list(
    title = str_c("[DEV] ", roc$title),
    description = c("This function is useful only for developers",
      roc$description),
    dev = NULL)
}
add_roccer("dev", roc_parser(one = parse_dev))
base_prereqs[["dev"]] <- c("_intro", "title", "details")
That defines @dev which adds a note to the title and the description
that the function is more suitable for developers than end-users.
The interface is likely to change a lot, and I doubt the package
currently installs (although it does work with devtools::load_all),
but if you're interested please take a look. I'm hoping a stronger
foundation will make it easier to keep on top of the bugs and to
flexibly implement new features.
Hadley
--
Assistant Professor
Department of Statistics / Rice University
http://had.co.nz/
Hadley Wickham
2012-09-10 21:17:23 UTC
Post by Peter Danenberg
There's no way we could develop this as a branch in the preëxisting
Roxygen repo, is there?
Yes, probably. I'm becoming more certain that this won't break any
existing roxygen code, so I'll move back over in the not too distant
future.

Hadley
--
RStudio / Rice University
http://had.co.nz/
Peter Danenberg
2012-11-15 11:24:57 UTC
Post by Hadley Wickham
Currently the roxygen2 code is hard to extend - the idea of roclets
was good, but I think they were too big - you want to be able to
work at the tag level so it's easier to add new features.
I've been thinking a little bit about this, and it would be nice if
the roclet (viz. roccer) model resembled roxygen1 in the sense that it
were based on function-composition.

I did a live-coding demo at LARUG where I showed the audience how to
create a custom roclet; and all the coding-by-convention stuff (e.g.
remembering to write roc_process and roc_output) was a little painful.

The original idea was to chain a series of roclets together that
optionally (and perhaps non-destructively) altered the parse-tree in
some specified order; possibly doing I/O.

This enabled the creation of e.g. translation roclets that would alter
the description associated with srcrefs for later processing; to
accommodate this in roxygen2, I had to do awful things like assigning
to the parent frame:

https://github.com/klutometis/larug-roxygen/blob/master/translation-roclet.R

It was painfully laborious compared to the mapping/composition model.
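The mapping/composition model described above can be sketched in a few lines. This is not roxygen1's actual API: compose_roclets, mark_dev and upper_titles are hypothetical names, and the parse tree is represented as a plain named list of blocks for illustration.

```r
# Chain roclets: each one takes a parse tree and returns a (possibly
# modified) copy, so they can be applied in a specified order.
compose_roclets <- function(...) {
  roclets <- list(...)
  function(tree) {
    Reduce(function(acc, roclet) roclet(acc), roclets, tree)
  }
}

# Roclet 1: prefix the title of blocks flagged as developer-only.
mark_dev <- function(tree) {
  lapply(tree, function(block) {
    if (isTRUE(block$dev)) block$title <- paste("[DEV]", block$title)
    block
  })
}

# Roclet 2: a stand-in for a "translation" step that rewrites titles.
upper_titles <- function(tree) {
  lapply(tree, function(block) {
    block$title <- toupper(block$title)
    block
  })
}

pipeline <- compose_roclets(mark_dev, upper_titles)

tree <- list(
  foo = list(title = "add numbers", dev = TRUE),
  bar = list(title = "plot things")
)
result <- pipeline(tree)
```

Because each roclet returns a new tree, the input is left unchanged and no assignment into a parent frame is needed.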

Anyway, you may have solidified the architecture at this point; food
for thought, though.
