@jlipps
Created September 3, 2013 21:55
Day 1: Spec Planning
Introductions :)
jlipps: works at Sauce and Appium
santi: works at Sauce. Will be architecting the backend of Sauce to support this spec.
dburns: works at Mozilla. Works on Marionette (webdriver with extras for Mozilla)
freynaud: ebay. iosdriver
je: works at facebook. ramping up on iosdriver work at facebook
dominik: ebay. creator of selendroid.
shs: Creator of Webdriver and leading the Se project. Working at Facebook, mainly focussed on Android and an android build tool.
Gestures
=======
(David Whiteboarding on Gestures API spec)
Mozilla put together a spec for mobile gestures with efficiency and extensibility in mind.
We've taken the normal gestures. Say a line is a gesture: the client packs all of its pieces together and sends them to the other end, which unpacks them. Another problem that had to be handled is things like 2 or 3 fingers, especially when you want these to play independently of each other.
For this, they've created multi-action, through which you create actions in the standard way, pack them in as multi-actions and send them all together to the other side. Actions are broken into time units called beats (vertical lines on a grid, where horizontal lines are each finger's actions). This enabled us to support things like touch-enabled tables, which can be extended to an arbitrary number of concurrent actions.
simon: can you do sparse arrays? Say we begin with 2, then switch to 5.
David: We've come up with kind of a dirty approach, where we use blanks that behave as no-ops for the missing fingers while an actual action is done with the others.
jlipps: so in the case of ios-driver, appium or selendroid, we'll need to come up with ways to implement this approach using the technologies we have available in each platform, so I'd love to see some implementation examples to wrap my head around it.
simon: so each of these multi actions is basically a single action, which you guys already deal with
jlipps: I see, so this is just batching them all together and enabling the servers to execute them in synchrony.
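A minimal sketch of the batching described above, assuming each finger's actions arrive as a plain list of step names. The class name, the `pack` and `beat` helpers and the `"noop"` filler are hypothetical and are not Marionette's API. The sketch only pads at the end of a chain; a finger that joins late would need leading blanks supplied by the caller.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the beat grid; not taken from any real client.
final class MultiActionSketch {
    // Assumed name for the "blank" that behaves as a no-op.
    static final String NO_OP = "noop";

    // Pads every finger's chain with no-ops so all chains have one entry per beat.
    // Each row of the result is one finger; each column is one beat.
    static List<List<String>> pack(List<List<String>> chains) {
        int beats = 0;
        for (List<String> chain : chains) {
            beats = Math.max(beats, chain.size());
        }
        List<List<String>> grid = new ArrayList<>();
        for (List<String> chain : chains) {
            List<String> row = new ArrayList<>(chain);
            while (row.size() < beats) {
                row.add(NO_OP);
            }
            grid.add(row);
        }
        return grid;
    }

    // Returns what every finger does at the given beat (one column of the grid).
    static List<String> beat(List<List<String>> grid, int index) {
        List<String> column = new ArrayList<>();
        for (List<String> row : grid) {
            column.add(row.get(index));
        }
        return column;
    }

    public static void main(String[] args) {
        List<List<String>> grid = pack(Arrays.asList(
                Arrays.asList("press", "move", "release"),
                Arrays.asList("press", "release")));
        System.out.println(grid);
        System.out.println(beat(grid, 2));
    }
}
```

The server side would then walk the grid one beat at a time and dispatch each column together.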
...
jlipps: and how do we wrap this up in a nice way on the client side?
David: which benefits Sauce in a lot of ways, as it's a lot more network efficient for a high traffic task.
jlipps: so our job then is to dissect these sequences of actions and figure out how to interpret them to find easier events like swipes.
francois: which can lead to "unsupported" actions depending on the server side.
simon: google was at the last w3c and they were perfectly happy with this spec.
Jason: how do you line up beats with time or events?
David: you have to do it by the way it's structured. If the implementation is done in JS, you wouldn't be able to specify a proper interval for these, which makes things very difficult. The longest-running action in a group determines when the next one happens. Say I put a 5s wait in action 3: it blocks action 1 (whose next step starts when the wait ends) until then. They will happen as close to parallel as possible.
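The timing rule David describes can be sketched as follows, assuming the grid is already rectangular (one duration per finger per beat) and that durations are known in milliseconds. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration: a beat starts when the slowest action of the
// previous beat has finished.
final class BeatTimingSketch {
    // durationsMs.get(f).get(b) is how long finger f's action at beat b takes.
    // Returns the start time of every beat, measured from the start of the batch.
    static long[] beatStartTimes(List<List<Long>> durationsMs) {
        int beats = durationsMs.get(0).size();
        long[] starts = new long[beats];
        long clock = 0;
        for (int b = 0; b < beats; b++) {
            starts[b] = clock;
            long longest = 0;
            for (List<Long> finger : durationsMs) {
                longest = Math.max(longest, finger.get(b));
            }
            clock += longest; // everyone waits for the slowest finger
        }
        return starts;
    }

    public static void main(String[] args) {
        // One finger taps quickly while another holds a 5s wait in beat 0.
        long[] starts = beatStartTimes(Arrays.asList(
                Arrays.asList(100L, 100L),
                Arrays.asList(5000L, 100L)));
        System.out.println(Arrays.toString(starts));
    }
}
```

With the 5s wait in the first beat, the second beat cannot start for either finger until 5000 ms have passed.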
francois: how do I debug what actually happens?
simon: we'll have to log down using the webdriver api.
francois: this will be very tricky to debug, especially for undefined actions like a 3-finger click with one finger down
simon: do we see this happen at all in the current actions api?
francois: we just need to debug this in detail. We need to understand what happens.
simon: more logging is the best we can do.
David: does it make sense at a high level?
jlipps: so it's basically the actions API, with a way to run them in parallel. Is there an implementation?
david: marionette
simon: which will soon make it to the webdriver clients, given enough time.
jlipps: perfect. I can look at the api then and start testing with it.
simon: you should chat to Jason Leyba
Code from Mozilla http://mxr.mozilla.org/mozilla-central/source/testing/marionette/client/marionette/marionette.py#164
Jlipps to scribe
moving on from gestures API
simon: do we have the necessary interaction primitives?
moving back to gestures
primitives we expect to see going across the wire:
david: longpress, tap, press, release, move, wait
santi requesting clarification since he was scrivening
jlipps: why longpress instead of tap with duration?
david: different OSes have different times to load up context menu via long press
david: also, swipe/flick on some OSes are different, some the same
simon: can we differentiate b/t swipe/flick?
santi: let's not re-interpret things multiple times
simon: tension between being expressive on client side and minimal on action apis
santi: if we start finding common patterns, why not promote to primitive
dominik: so if we want to do just one thing, stick with touch api?
simon: well.... these will eventually be deprecated. let's just send a batch of one action using multi-actions
simon: do we all agree on the above primitives?
all: we agree
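A minimal sketch of the agreed primitives and of simon's "batch of one action" idea. The enum constants come from the list above; the class name, the `batchOfOne` helper and the list-of-lists shape are assumptions, not a wire format anyone agreed on.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical illustration of the six primitives sent across the wire.
final class PrimitivesSketch {
    enum Primitive { LONGPRESS, TAP, PRESS, RELEASE, MOVE, WAIT }

    // A single action expressed as a multi-action: one finger, one step.
    static List<List<Primitive>> batchOfOne(Primitive primitive) {
        return Collections.singletonList(Collections.singletonList(primitive));
    }

    public static void main(String[] args) {
        System.out.println(batchOfOne(Primitive.TAP));
    }
}
```

This keeps one code path on the server: a plain tap is just the smallest possible multi-action.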
Locators
======
jlipps: We added this to figure out how to locate elements on platforms like Android or iOS, which aren't structured the way the web is.
FR: I get the element tree and then do XPath
JL: How do you dump and use?
FR: I get the ID and build own structure and then
SS: We use bully and that has a clean
FR: We don't have that. Android is easier. There is no element equality in iOS. element == element === false;
SS: we moved it to the client side for simplicity
JL:
FR: We would like to use the class name from Apple to simplify the iOSDriver work. This uses the UIA stuff.
JL: Is that because appium uses shortcuts?
FR: No, people want it to look like the web, even though it's not like that, but it helps users
JL: The canonical name should be used and we shouldn't change it.
SS: Names must be lower case and with spaces since its just a string. WebDriver is say what you mean except including for locators:
<missed>
JL: I don't want anything that isn't semantic
DD: In selendroid, by tag uses the class name and we look it up.
SS: Bully has its own unique FindBy classes. It doesn't implement things that aren't in Android. If it's not there it doesn't compile, since it can't cast to it. We can create a WP8 By class if need be. It maps closest to UIAutomation.
JL: Create a FindBy for accessibility so that it goes cross-platform
SS: This maps nicely to ARIA
FR: Do we decide or the W3C? W3C will get it for WebDriver 1.1; we can uplift it to them later. We can agree and implement and then push.
SS: **** SHOWING CODING EXAMPLE***** Shows how it works in Bully and how everything is built up. Bully w
JL and SS: Tag Name, Class name, Accessibility.
JL: You (FR) use localisation in your tests. How does that work?
FR: We change it en route to the server and we change it. We use Apple regex which is "fun". Lets discuss this later so not to
DD: If you pass in an l10n key I then try to change it to help with find by text. The real element has the l10n key in the XML but it doesn't translate to the screen.
FR: We have access to the bare bones of the app which isnt like the web
SS: Yes and no. We have a
JL: Why don't we have L10n in the spec?
FR: I need to test 12 languages. I want to test the functionality, not the l10n. Do we unpack on client or server? On the server.
SS: If it's not on the server we need to replicate it in different languages.
JL: Feels like we need a server extension, e.g. ADB commands
FR: We have decorators to do that.
SS: Facebook has extended the protocol as per spec (-fb-*), so that's easy to extend.
Santi: FR's implementation is simple and easy to do.
FR: There are drawbacks that we need to be aware of, like it can't handle implicit waits
JL: Do we need xpath?
FR and SS: Yes. XPath needs to be our fallback because people don't make accessible apps.
SS: XPath is great for when you're in a tricky spot.
JL: We will need to reimplement things
Santi: We will, so that's not a real problem.
DD: We have a common source of truth to create our XPath. We need to agree on this
SS: Bully takes the hierarchy and dumps that.
Santi: JSON or XML? XML seems standard
FR: JSON is better
JL: We have both depending on what the user wants.
SS: We want a canonical source of truth for the UI of the app and see how that maps to
Santi: Then it should be XML
SS: XML and JSON readability can help but if its
Santi: We could use YAML.
JL: XML is the easiest to dump into another tool to do XPATH.
FR: What do we put in the XML? L10n?
SS: Dump the JSON given and translate to XML. We can then add extensions to that to manage what people want.
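A minimal sketch of the "dump the hierarchy, translate to XML, run XPath" idea, using only the JDK's `javax.xml` classes. The `UiNode` type stands in for whatever structure a driver actually dumps, and the `label` attribute and the helper names are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical illustration: a dumped UI tree converted to XML and queried by XPath.
final class HierarchyXPathSketch {
    // Stand-in for one element of the dumped hierarchy.
    static final class UiNode {
        final String type;
        final String label;
        final List<UiNode> children = new ArrayList<>();

        UiNode(String type, String label) {
            this.type = type;
            this.label = label;
        }

        UiNode add(UiNode child) {
            children.add(child);
            return this;
        }
    }

    // Recursively turns a node into an XML element named after its class.
    static Element toXml(Document doc, UiNode node) {
        Element element = doc.createElement(node.type);
        element.setAttribute("label", node.label);
        for (UiNode child : node.children) {
            element.appendChild(toXml(doc, child));
        }
        return element;
    }

    // Counts the elements in the tree that match the XPath expression.
    static int count(UiNode root, String xpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            doc.appendChild(toXml(doc, root));
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(xpath, doc, XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("XPath lookup failed: " + xpath, e);
        }
    }

    public static void main(String[] args) {
        UiNode root = new UiNode("UIAWindow", "main")
                .add(new UiNode("UIATableView", "menu")
                        .add(new UiNode("UIATableCell", "Login"))
                        .add(new UiNode("UIATableCell", "Settings")));
        System.out.println(count(root, "//UIATableCell[@label='Login']"));
    }
}
```

Extensions such as an l10n key could be added as extra attributes on each element without changing the lookup.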
SS: Selenium server handles less headers.
SS: What is the API going to look like? Do we pass in Enums?
Santi: Strings?
SS: They aren't good at describing where the user should go.
Scrolling to elements
JL: How do we do it?
DB: Actions? Doesn't that work?
JL: Android only renders this as you need it. E.g. ListViews?
DD: Selendroid has a client implementation to try to figure out if it's there and whether we need to try to scroll to it.
SS: WebDriver, and this, shouldn't have a scroll because screen sizes differ and we have issues, so let's not do it.
DD: I need a special case for ListViews.
Santi: Let's push it to the server side so that simplifies things.
SS: Bully will return screen size in capabilities returned.
Santi: Its all edge cases
FR: iOS does this nicely.
Jason: it just feels like magic and only "mostly" works. I really dislike that! The webdriver server
FR: Actions can handle the stuff that the auto scroll can't do. That requires you to understand your app.
SS: We need to raise a bug with Android to expose a few more features like iOS does.
WebViews and Other Contexts
======================
DB: Marionette doesn't follow the normal conventions of webdriver. Use a "set" command. Purposefully hard for non-moz engineers to use. In B2G, everything is a webview, but it's multiprocess (makes it more secure) Makes things like handling alerts harder, but marionette handles that for you.
JL: What about the case when an app contains an embedded webview?
Everyone: we do slightly different things, but based on switchToWindow() with a parameter.
SS: Why not just cast a WebElement to a WebDriver?
Santi: How would that look on the wire protocol?
SS: No idea.
SS: Perhaps send back an additional piece of information that indicates that this is a webview of some sort.
DB: We plan to do this already for HTML5 input elements (eg: DateTime)
FR: What about different webviews on the page. Outlines example where finding the webview is tricky.
JL: Appium queries the debug protocol.
FR: But that can't be used to find the window.
Discussions ensue. Options considered:
1) Stick with the current approaches
2) WebElement e = driver.findElement(By.id("webview")); WebDriver d = (WebDriver) e;
3) Set<String> cs = driver.getContexts(); driver.switchTo().context(cs.iterator().next());
Decided to go with option 3 as it's consistent with the existing model. SS sad #2 isn't a viable option :(
JL: What about handling commands that are available in native but not in web?
SS: Use UnsupportedCommandException?
JL: Perhaps, but that implies nothing on the server understands that command, which may not be true.
SS: "InvalidInContextException" to match our existing namespace.
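A minimal sketch of option 3 together with the proposed "InvalidInContextException", assuming a toy in-memory driver. The context names `NATIVE` and `WEBVIEW_1`, the `shake()` command and the class itself are hypothetical; only `getContexts`, the context switch and the exception name come from the discussion.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical illustration of context switching; not a real driver.
final class ContextSketch {
    // Raised when the server knows a command but it is not valid in the current context.
    static final class InvalidInContextException extends RuntimeException {
        InvalidInContextException(String message) {
            super(message);
        }
    }

    private final Set<String> contexts = new LinkedHashSet<>();
    private String current;

    // The first name given becomes the starting context; at least one is required.
    ContextSketch(String... names) {
        for (String name : names) {
            contexts.add(name);
        }
        current = names[0];
    }

    Set<String> getContexts() {
        return contexts;
    }

    void switchToContext(String name) {
        if (!contexts.contains(name)) {
            throw new IllegalArgumentException("No such context: " + name);
        }
        current = name;
    }

    // Assumed native-only command, used here to show the exception.
    String shake() {
        if (!"NATIVE".equals(current)) {
            throw new InvalidInContextException("shake is not valid in context " + current);
        }
        return "shaken";
    }

    public static void main(String[] args) {
        ContextSketch driver = new ContextSketch("NATIVE", "WEBVIEW_1");
        System.out.println(driver.getContexts());
        driver.switchToContext("WEBVIEW_1");
        try {
            driver.shake();
        } catch (InvalidInContextException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The separate exception type lets a client tell "nothing on the server understands this" apart from "not here, switch context first".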
Session Handling
=============
shs: bully: 1 session per device. 2nd session is error
jlipps: appium new session clobbers old
shs: bully can run multiple sessions on different devices
all agree it should error
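The agreed behaviour (one session per device, a second request is an error rather than clobbering the first) can be sketched like this. The registry class, its method names and the choice of `IllegalStateException` are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical illustration of "1 session per device, 2nd session is an error".
final class SessionRegistrySketch {
    private final Map<String, String> sessionByDevice = new HashMap<>();

    // Creates a session for the device, or fails if one is already running there.
    String newSession(String deviceId) {
        if (sessionByDevice.containsKey(deviceId)) {
            throw new IllegalStateException("Device already has a session: " + deviceId);
        }
        String sessionId = UUID.randomUUID().toString();
        sessionByDevice.put(deviceId, sessionId);
        return sessionId;
    }

    // Ends the device's session so a new one can be started.
    void quit(String deviceId) {
        sessionByDevice.remove(deviceId);
    }

    public static void main(String[] args) {
        SessionRegistrySketch registry = new SessionRegistrySketch();
        registry.newSession("device-a");
        registry.newSession("device-b"); // different device: allowed
        try {
            registry.newSession("device-a");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```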
Implicit Waits
==========
SS: Implicit waits are added for people who aren't aware of the lifecycle of their pages or aren't sure where to put explicit waits. Seems reasonable to have the same concept in mobile.
SS: BTW, "Wait<>" was designed to be extensible, and ExpectedConditions were added to handle the common cases. Should be possible to add a "ServerSideWait" that can push those across the wire to be executed server-side.
Desired Capabilities
===============
Device vs Emulator: 'deviceName'
App (ios only?): 'app'
Automation backend (iosdriver/appium/selendroid): 'automationName'
OS on device: 'platformName'
Version of OS on device: 'platformVersion'
App is also important for selendroid
Caps agreed upon:
--------------------------
platformName: Android, iOS
platformVersion: 4.3, 7
deviceName: Nexus 4, iPhone 4S, Simulator
automationName: Appium, ios-driver
app (optional): /path/to/app/to/use.app
browserName (optional): Safari
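A minimal sketch of a capabilities map built from the names and sample values above. Treating the four caps not marked optional as required is an assumption, as are the class and helper names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the agreed capability names.
final class CapabilitiesSketch {
    // Assumed to be required because the notes only mark app and browserName as optional.
    static final List<String> REQUIRED = Arrays.asList(
            "platformName", "platformVersion", "deviceName", "automationName");

    // Returns the required capability names that are absent from the map.
    static List<String> missing(Map<String, String> caps) {
        List<String> absent = new ArrayList<>();
        for (String key : REQUIRED) {
            if (!caps.containsKey(key)) {
                absent.add(key);
            }
        }
        return absent;
    }

    // Builds a native-app request using the sample values from the notes.
    static Map<String, String> example() {
        Map<String, String> caps = new LinkedHashMap<>();
        caps.put("platformName", "iOS");
        caps.put("platformVersion", "7");
        caps.put("deviceName", "iPhone 4S");
        caps.put("automationName", "Appium");
        caps.put("app", "/path/to/app/to/use.app");
        return caps;
    }

    public static void main(String[] args) {
        System.out.println(example());
        System.out.println(missing(example()));
    }
}
```

The same test could then target another backend by changing only `automationName`, which is the convergence goal discussed on day 2.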
Supra-App Testing
==============
Hardware controls: simon to OSS bits of bully
JL: What about the soft keyboard?
DB + FR: we expose that somehow. Using magic.
Discussion about orientation and names for these things. Accelerometer to be left out for now.
DB: http://www.w3.org/TR/screen-orientation/
Day 2
=====
Adding to w3c
-------
db: not going into webdriver spec. need a new spec. will fork and work on it around november. can talk about uplifting it. françois and dominik will talk to ebay rep for w3c and join the working group. no time commitment other than a few e-mails.
db: for saucers, maybe they can join w3c or become invited guests for the working group
db: so we fork webdriver spec, get whatever we've decided and documented here, uplift it to w3c, where there will be googlers, microsofties, etc, where they will have lots of comments based on their stake. that's the process to consensus
jl: we can start working on what we will uplift now
db: yes we can agree on straw man and do all documentation, simon and i can edit it to make it spec-like.
jl: we can do authoring.
db: once straw man is created we can have more discussion, then discuss it at working group, land it, go shopping.
jl: i'll create an authorship document
db: be careful of the way you word things, because you want to remove any ambiguity for a spec. don't use 'should'. use 'must', or just direct.
Server-side changes roadmap
------
ss: so I'd like to have a date by which all server-side implementations can converge on the spec we've all agreed upon. The goal is to be able to run a single test using two tools like selendroid and appium (or iosdriver and appium, or firefox os) without changes
fr: I can definitely promise, that's cheap :)
jl: I will be planning on a 1.0 that breaks backwards compat for this
fr: I'll be doing a beta version in which breaking APIs are announced as deprecated.
jl: do you guys think we can aim for these as short term?
fr: I have some things for the end of the year and vacation in september. Otherwise, I can focus 100% on this.
dd: I have plenty of time for this, although there are other tasks.
ss: so we'll need the spec first
jl: I can get that done in a few days
fr: once I have the spec, it will be fairly easy
Test suite
-----
jl: it'd be great to have a suite that just works on all of these tools as a litmus test for ourselves. In earlier conversations with Simon, he seemed to agree on this being maintained by the selenium project.
db: so how do we build an app that we can use across platforms?
fr: we could use UICatalog. Aren't both of us using that?
jl: yup, I am. we could cover 90% of the features using that app
jl: there's a similar thing for android, "API Demos"
dd: I looked into it, though its webview is crap. Do you know if we can get the source?
jl: yeah, I have that and have already forked it. We could maintain that fork together and add the features both our projects need. We could put the Selenium html test files in the app as well.
ss: do we serve the Selenium playground html pages?
dd: I'm already doing that. Serving the files locally from the phone.
jl: I don't think we should serve from the web. These should be used locally in the phone.
ss: ok, so we've got android and ios covered. For firefox os we can just use the selenium playground?
db: yeah, we can use that
jl: It'd be nice to have a more "native" app, with most of these features covered like orientation and mobile
ss: then we can use that in the android's and ios' webview tests
jl: so will you implement switchToContext?
db: we could alias it. We do something similar, so it would work.
jl: we'll chat about it once we have the document ready
fr: it'd be nice to call it switchToDriver instead of context so users know when they run on native context vs web. Safari automation on a real device won't be able to come out of the webview context.
jl: that's right.
ss: this happens in desktop browsers. Users just can't ever switch. Shouldn't we just throw an exception when driving mobile safari?
fr: well, when automating safari on ios, you can go in and out of native when dealing with alerts, https warnings and so forth
ss: let's discuss the wording once we get the formal spec started. That way we have time to think about wording.
jl: I don't care, let's have this conversation once we get started.
jl: how about testing pinch and zoom
db: we use firefox's gallery app
jl: we could do that too, probably. The advantage of this is we get to put metadata we can assert upon
fr: we can also do things like crashes
jl: oh, we need to standardize crashes
db: we don't do that in desktop. If the browser crashes, the selenium end point goes down with it. I wouldn't worry much about these edge cases
<side discussion about what tools can do on crashes>
ss: seems like an edge case scenario. Not worth putting in a spec
all: agreed
jl: ok, let's consider Titanium for creating a cross platform native gesture recognition app.
fr: suggests to log out the gestures to have the ability to verify them.
<discussion on whether a web version is enough or not for testing gestures>
ss: what do you guys do for firefox os and the current implementation of this api?
db: for scrolling, we test where we are, for example
jl: I do have the api app for android. Maybe we can use that one.
db: it will be a collaboration, I'm happy to lead it
dd: should we put this in the selenium repo?
all: we'll put it in a separate repo. Apps, suite, apis
jl: selenium is an org, we should put the new repo there
fr: selenium is for web, this isn't web. We don't want to confuse users
ss: it's just a github org, I don't think it will be very confusing
db: we're changing selenium to be the platform for testing in general, not the web anymore
ss: new repos under the selenium org it is then.
all: ok
db: let's talk to simon on his thoughts. github vs google code.
ss: ok, then let's not talk to simon :P
<rant on github vs google code>
<rant on vmware vs kvm>