Research Platform Services Blog

  • Archive
  • RSS
  • Got a question? Let's talk.

Halfway through my PhD candidature in linguistics at Melbourne Uni, I was introduced by Fiona to the ResPlat family. One of their aims, I was told, was to train researchers across the university in emerging tools and methods for doing better, more reproducible research. A specific target of this agenda was the Humanities and Social Sciences, who, let’s admit, sometimes lag behind a little when it comes to engagement with digital tools and methods.

My thesis was about corpus linguistics—that is, using computers to locate patterns in large collections of written text. Because of this, Fiona asked me if I could come on board and help out, teaching Python to researchers around the university, but with extra focus on those from the humanities. A key issue among corpus linguists, however, is that many don’t really know how to code. A more common workflow is to load text files into graphical tools, which provide interesting, but in many senses limited, windows into natural language data. The expertise is more in the interpretation of results than in the generation of them.

My confession is that at the time, this was me. I ran decade-old software and pressed the 'Keywords' button to get a list of words that were 'key' in the texts. I described and tried to explain the meaning behind whatever output the tool gave me—but the process was leaving me with doubts. When there were problems, could I fix them at their source? If someone gave me a new set of texts, or if I updated the old set, would I have to start all over again? Was what I was doing transparent and reproducible? And though it was all very interesting, was I really doing research that I could respect?

Regardless of how things were going in Thesisland, with only the most basic knowledge of shell scripting and Python under my belt, in December of 2014, I was invited on board with ResPlat, and was quickly apprenticed (read: hazed) in. While being an instructor was a key part of the job, really, I was a student at the same time. We were running the first ResBaz in February, and I was supposed to revise the course materials for “Text analysis with Python”. Uh oh.

First #HackyHour of the year, snacks courtesy of @OKFNau pic.twitter.com/mhtcM5BskU

— Fiona Tweedie (@FCTweedie)
January 15, 2015
“Hacky Hour/Frantic ResBaz Preparation”

Over the summer, I learned and practiced Python in the Jupyter Notebook, and put what I learned right into our lesson materials. It was a beginners’ guide in more than one sense. Lachlan taught me Git with patience and mercy, so that our emerging materials stayed open-source and under version control.

As I learned, it became obvious how I could apply the code to my thesis research. So, I did. I started writing code that could extract the most common nouns from my dataset. Then, I wrote code that counted the number of imperatives. Before long, I was writing a Python module for getting texts annotated with grammatical features, for searching those annotated texts, and for visualising the results. An early version of the module was used during ResBaz, to show how you can progress from a series of text files to an analysis of meaning and pragmatics in Australian political discourse. Today, use of the tool is becoming more widespread. It bridges the divide between corpus and computational linguistics, and addresses some of the misgivings I had about the methodology of my thesis.

Seems like #challengeaccepted is very quickly becoming our new #ResBaz mantra! Cheers @About_Memory!

— Research Bazaar (@ResBaz)
February 15, 2015

Because of ResBaz, my research improved, and I homed in on what it is that I really enjoy doing. I also learned the terrifying art of teaching while live-coding—a skill that comes in handy all the time, both for teaching and for conference talks. By submission time, the code was, in my eyes, a key contribution of my work. Shortly after, a live demonstration of the module helped me land a postdoc position at the University of Tübingen, working within the European CLARIN (Common Language Resources and Technology Infrastructure) project. Like ResBaz, CLARIN aims to provide researchers, especially from the humanities and social sciences, with access to, and training in, the digital resources that underpin more and more of modern research. ResBaz showed me not only how important this aim is, but how much fun it can be to work toward. More specifically, my role will involve developing software and creating exemplar projects, using languages (German, Java) that I'm far from fluent in. No worries—ResBaz, via Jee, taught me to say "Challenge accepted".

ResBaz Germany, Summer 2017. You heard it here first.

Daniel

    • #daniel
    • #resbaz
    • #nltk
    • #guest
  • 2 years ago

Data, Data Everywhere…

by Ewan Nurse

Throughout the modern scientific community, many competitive grants and publications require open publication of data. This is good for science: new methods can easily be tested on other researchers' data, helping us with our own work, and studies can be more easily replicated, verified and (fingers crossed) expanded upon. Unfortunately, more data in our research leads to more processing time, making some studies intractable (computer scientist talk for "it takes a really, really long time") on a standard desktop computer.

So what can we do to speed up our processing time?

Parallel Computing on a MatLab Distributed Computing Server

Let me introduce you to your new best friend and data cruncher extraordinaire: parallel computing on the MatLab Distributed Computing Server (MDCS), a server to which you send your parallel MatLab computing jobs.

Normally when you run a job in MatLab, your computer runs everything in series: it reads through your code and does only one thing at a time. Parallel computing works by splitting your task into lots of smaller tasks and sending each of the smaller tasks to a different worker. When they're done, the results all get sent back to your terminal.

[Image: http://www.mathworks.com/cmsimages/63635_wl_91710v00_po_fig2_wl.gif]

In the following example, for each iteration of the loop a different column of DataMatrix is sent off to MyFunction for analysis and the result is stored in Output, one at a time:

for i = 1:10
    Output(i) = MyFunction(DataMatrix(:,i));
end

Using parallel processing, our code looks like this:

parfor i = 1:10
    Output(i) = MyFunction(DataMatrix(:,i));
end

The difference here is subtle but very important. I've used the workhorse of MatLab's Parallel Computing Toolbox, the parfor loop. This is a lot like a standard for loop, but it sends each iteration of the loop out to a different worker, so multiple iterations of the loop are being worked on at once.

You can use parfor on any machine with the Parallel Computing Toolbox, although the number of cores is likely to be low (my laptop has 2 cores, my university desktop has 4). Using the new MDCS at UniMelb, we now have access to 256 cores! This means your code can basically run 256 iterations of a for loop at once, instead of waiting for them to run in series.
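
To actually use those cores, you submit your job to the cluster rather than running it in your local MatLab session. Here is a minimal sketch of one way to do that with the batch command; the cluster profile name 'UniMelb_MDCS' and the script name 'MyAnalysisScript' are placeholders I've assumed, so substitute the names configured for your own account.

% Sketch: submit a parfor-based script to the MDCS cluster as a batch job.
% 'UniMelb_MDCS' and 'MyAnalysisScript' are assumed placeholder names.
c = parcluster('UniMelb_MDCS');                  % load the cluster profile
job = batch(c, 'MyAnalysisScript', 'Pool', 255); % 255 workers for parfor, 1 runs the script
wait(job);                                       % block until the job finishes
diary(job);                                      % display any command-window output
results = load(job);                             % retrieve the script's workspace variables
delete(job);                                     % clean up the job when you're done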

[Image: http://www.reactiongifs.com/r/2013/06/Mother-of_God.gif]

I know, it’s pretty cool. But, there is a little pain that comes with this gain:

  1. Each loop iteration can't depend on another iteration. Each iteration effectively goes to a different computer and can't communicate with the others, so you have to make sure that no iteration needs information produced by another one (see the sketch after this list).
  2. There is what MatLab calls a 'communication overhead': it takes some time to send the task from your terminal to the cores and back again, so it won't always save time if you're constantly sending small jobs back and forth.
  3. MatLab can have some trouble identifying variables as they're passed to and from the cores.
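
To make point 1 concrete, here is a small sketch (reusing the made-up MyFunction and DataMatrix from above) contrasting a loop that cannot be parallelised, because each iteration reads the previous result, with one that can:

% Not parallelisable: iteration i needs the result of iteration i-1.
RunningTotal = zeros(1, 10);
for i = 2:10
    RunningTotal(i) = RunningTotal(i-1) + MyFunction(DataMatrix(:,i));
end

% Parallelisable: each iteration only writes to its own slice of Output.
parfor i = 1:10
    Output(i) = MyFunction(DataMatrix(:,i));
end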

Because of these issues, it can take a bit of work and reading through the MatLab documentation (http://au.mathworks.com/help/distcomp/index.html) to get your code ‘parallelised’, but the computation time saved is absolutely worth it.

[Image: https://s-media-cache-ak0.pinimg.com/originals/e5/91/ed/e591ed7a8f94dfb1a2de49bc7048491b.gif]

Time for an Example:

Philippa Karoly (ResBaz MATLAB helper extraordinaire) and I published a paper in PLOS ONE that was greatly helped by parallel processing. If you're super keen, the full article can be found here: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0131328

The basic purpose of the paper was to use a machine learning algorithm, neural networks (multi-layer perceptrons for the experts playing at home), to detect if someone was squeezing their left or right hand (or not squeezing at all) from their brain activity alone. A big problem with neural networks is figuring out how many neurons you want the algorithm to use. To help us with this, we initialised many networks with different numbers of neurons, trained them all, and then selected the best one.

Because each network is independent, we were able to train the networks using parallel processing, sending a different neural network out to each core. This meant that instead of each one waiting in line for the next iteration of the loop, we could train more than one at a time. Then, when they had all come back, we could compare them and keep the best network. As a result, we got results much faster than with a standard for loop.
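
As a rough illustration (a sketch, not the actual code from the paper), a parfor loop over candidate network sizes might look something like this. The range of hidden-layer sizes and the training data X (features) and T (targets) are assumed placeholders, and it relies on MatLab's Neural Network Toolbox:

% Sketch: train one network per candidate size in parallel, keep the best.
% X (features) and T (targets) are assumed to be loaded already.
hiddenSizes = 5:5:50;                      % candidate numbers of hidden neurons (assumed range)
perf = zeros(size(hiddenSizes));
nets = cell(size(hiddenSizes));
parfor k = 1:numel(hiddenSizes)
    net = feedforwardnet(hiddenSizes(k));  % a multi-layer perceptron with hiddenSizes(k) neurons
    [net, tr] = train(net, X, T);          % each worker trains its own network
    perf(k) = tr.best_vperf;               % best validation error for this network
    nets{k} = net;
end
[~, best] = min(perf);                     % lowest validation error wins
bestNet = nets{best};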

What are your parallel processing success stories? Share them by emailing research.bazaar@gmail.com! Having trouble with parallel processing? Come by #HackyHour every Thursday, 3pm, Tsubu Bar, with your research woes! 

    • #guest
    • #matlab
    • #ewan
    • #data
    • #parallel computing
    • #open data
    • #MDCS
  • 3 years ago

Integrated clinical care with Mediaflux/CAReHR

How these systems are crafting better health & health services for refugees & immigrants. 

Prof Beverley-Ann Biggs and collaborators in the University of Melbourne Department of Medicine and the Royal Melbourne Hospital identified the need for a multi-site, web-based clinical information management system for patients from a refugee-like background, who often have multiple and complex health problems compounded by poor language and literacy skills. 

Arcitecta collaborated with the University to develop the initial version, which then attracted Government funding for the development of the Refugee Health Clinical Hub to assist clinicians with improving the health outcomes of recently arrived immigrants and refugees in Victoria. Important elements of this system are Arcitecta's Mediaflux data operating system and their Clinical Audit Research electronic Health Record (CAReHR).

[Image: Screenshot of the Immigrant Health Hub website]

This system offers an integrated model of care by providing point-of-care decision support for clinicians, creating a database of de-identified population data to allow evaluation of service delivery, and implementing and monitoring evidence-based health care. It also facilitates linkages with cdmNet for easy sharing of documentation with the patient's General Practitioner. The system is now in regular clinical use in Immigrant Health Clinics at The Royal Melbourne Hospital, The Royal Children's Hospital, and Barwon Health.

Following its use in immigrant health, CAReHR is now being deployed in The Royal Melbourne Hospital Victorian Infectious Disease Service Outpatient clinics to improve the clinical management of  patients with infectious diseases, including TB and HIV/AIDS, and in the Pathway to Good Health Project to improve the wellbeing of children and young people in out-of-home care. 

CAReHR provides:
- an electronic health record that can easily be configured by clinicians according to patient group and emerging clinical issues;
- computerised clinical notes, patient care summaries and pathology requests; and
- one-click creation of patient care summaries (for GPs) and patient medical records (for records management).

CAReHR services both clinical and research needs within the one system, removing the need to build separate systems. For example, the moment a clinician creates a new disease template it becomes available (in de-identified form) for authorised researchers to use. A major aim of implementing the system in the Royal Melbourne Hospital Refugee Health Clinic was to permit best practice evaluation and clinical research, with the ultimate goal of better health and health services for recently arrived refugees and immigrants. This aim is being achieved and extended to other health contexts.

Additional reading about this work can be found in these articles:

http://www.ipaaleadershipawards.org.au/awards/service-delivery-award/melbourne-health/

http://www.pulseitmagazine.com.au/index.php?option=com_content&view=article&id=1634:collaboration-through-technology-for-people-of-refugee-background&catid=16:australian-ehealth&Itemid=328

http://www.arcitecta.com/Products/ForeHealth   

    • #mediaflux
    • #CAReHR
    • #MDHS
    • #medicine
    • #Arcitecta
    • #guest
  • 3 years ago

PhD Candidate Khalid shares his #SharksDen Experience!

by Khalid Abdulla

This July, I took part in the first ever “Shark’s Den Challenge” as part of the Research Bazaar. I turned up not really knowing what to expect, except that we’d be designing a new product/service and at some point would be let loose on the 3D printers. 

Love team #radiance who are presenting self power bike lights! Go team! #sharksden pic.twitter.com/4uoLmZk3Ed

— Gil Poznanski (@koshertonystark)
July 29, 2015

On the first night we had a speed-introductions session to meet the other people taking part, and I got to meet the team I’d be working with. I decided to take on the role of CTO (Chief Technical Officer), because I’m an engineer and tinkerer at heart.

On subsequent weeks we received training on how to use the 3D printers, as well as some excellent advice on how to form/run a start-up and how to pitch a company/product. At various brainstorming sessions our team produced many weird and wonderful ideas, but by the end of the second week we had settled on the idea of producing a low-cost self-charging spoke-mounted bike light, which would offer improved visibility of cyclists from the side.

 

There are similar products available, but these tend to retail at over $100 and rely on batteries which need to be recharged after 5-10 hours of run-time. Three people on our team commute by bicycle, and liked the idea of a low-cost alternative which could be fitted once and then never need to be removed or recharged. Over the next couple of weeks we iterated various designs of the mounting brackets, and the electrical design of the coil/magnet arrangement which would allow the lights to be charged using the relative movement between the wheel and frame.

Team Radiance with their demonstration! #SharksDen #Radiance pic.twitter.com/HNBiuia8Kg

— Aliza w (@awajih08)
July 29, 2015

We managed to cobble together a (mostly) working prototype for the pitch night on the fourth week, and whilst we didn’t win, we did get some positive and constructive feedback. Being thrown together with a diverse team (in terms of backgrounds, training, interests, experience etc.) was great. As a PhD student, in my everyday work I tend to interact mostly with other PhD students, and primarily from the same department. Whilst I don’t think the idea/product we came up with was earth-shattering, I think such mixed teams are a great place for interesting ideas to start and grow.

I also liked the fact that the challenge was product-focused because of the emphasis on using the 3D printers. Many startup challenges tend to be geared towards software or services because they have lower barriers to entry and are easier to scale. The time-frame of the challenge was very tight, at only 4 weeks. Most (if not all) members of the team had a day job as well as other commitments, so people only really had 8-10 hours to spend on the project, including all the brainstorming, designing, building and pitching.

A big congrats to team #Radiance for getting get a working #bikelight #prototype ready at #SharksDen. Next up #MAP16! pic.twitter.com/a2Z2xDupZm

— Paul Mignone (@PJMignone)
August 3, 2015

In some ways being tight on time was good, as it forced an intense/focused effort (and was a nice contrast to my PhD in which I do a lot of thinking/simulating/optimising rather than taking more of a ‘just do it’ attitude). However, it did reduce the scope of what we considered practical as part of the challenge; as we wanted to be sure of having at least a proof-of-concept prototype for the pitch night. A longer-format version of the challenge might encourage people to dream bigger, as well as offer more of a chance to think about practical commercialisation of their research.

I feel that I have learned a lot from taking part in the challenge: in particular, the different roles which need to be filled in a small startup company, the different requirements which need to be satisfied for a product to be novel and successful, and finally the importance of, and some handy hints for, pitching. As a researcher it's important to remember that doing the research is only part of the job; it is also necessary to communicate and sell the research, and I think the "pitching" training was useful for developing those skills.

Most importantly…it was fun.

Absolutely smashing time at tonight’s #sharksden! Thanks for the fun ride! @PJMignone @awajih08 @ResPlat @ResBaz pic.twitter.com/Xsh2R7GQIb

— Vincent Khau (@thevinniek)
July 29, 2015
    • #guest
  • 3 years ago

#SharksDen: A Researcher’s Perspective (Part 1)

by Gene Venables

Entering into the Shark’s Den from the research world I didn’t quite know what to expect. My hopes were to meet new people and to build my very rudimentary 3D modelling skills. As a researcher I had dabbled with 3D printing technology to make some simple equipment to help my lab work. But they were just one-off pieces for my own use and sense of accomplishment.

In the Shark's Den I learned of a world which I hadn't considered previously: commercialisation, and how others might be interested in using the pieces I've made. By the end of the intense five-week programme I pitched an idea I was proud of, and thankfully I didn't completely flounder in my first pitch. I found the Shark's Den to be a great way to dip my toes in and learn about how the incubator and start-up world works.

#resbaz #SharksDen team 1 preview: #Bespoke guitar components pic.twitter.com/dX8VCl3Whs

— Paul Mignone (@PJMignone)
July 23, 2015

I hadn't considered it beforehand, but I came to realise that so many products in the modern world that I use, from the watch on my wrist to the smart phone in my pocket (and many of the apps on it!), had gone through this pitch process. It has changed the way I think about products I use, and I'm sure it will influence ideas that I conceive in the future.

The programme was well instructed and guided throughout, and I never felt lost at sea. Over the five weeks I worked with a group of talented people from a variety of different backgrounds. I hope to foster further collaborations and I am excited to venture into deeper water with my new skills.

Gene pitches the Artistic Instrument Designing Co. concept at #SharksDen pic.twitter.com/c1ohdozsE7

— Paul Mignone (@PJMignone)
August 2, 2015

Gene Venables
Research Support Officer
Anatomy and Neuroscience
The University of Melbourne

    • #resbaz
    • #sharksden
    • #paul
    • #gene
    • #innovation
    • #startup
    • #digismith
    • #3dmed
    • #3dprinting
    • #guest
  • 3 years ago

Mapping the way of everything

by Hannah Oates [guest blogger]

Dear internet, 

I have always hated maps. Being part of a fairly old-fashioned family who detest spending money on things they 'believe' they don't need means that GPS never came into my life. So we would bring out the old 'Melways', with its coffee stains and ripped-out pages, and attempt to find our destination (this was before we realised our phones could do it for us). So my basic experience of maps was never very exciting.

But yesterday, my world was turned. No longer do maps just chart a path from A to B: they can map almost anything. The CartoDB workshop, run by Steve Bennett with the help of Fiona and Lachlan, was mind-blowing.

Starting off with basic procedures, we learnt how to log in, what the buttons do, very simple things. Then we scaled up into another realm of difficulty, spiralling into the map-making empire. We began with Ebola. Not the happiest topic, but it gave an interesting element to our maps and came with a rather curious data set. Then we learnt how to work with data sets: downloading them, uploading them, playing around with them, changing parts of them, then having to change them back. It all seemed to revolve around the data set, which is accurate: the entire program revolves around you having a proper data set.

And with this seemingly magical set of data comes a map. Quite simply, a fully-fledged, legitimate (depending on the data set) map that is much more interesting than the Melways.


[Image: CartoDB map of public internet access points in Victoria]

For example, here is a map that I made, which shows every public internet access point in Victoria. How cool is that? Not to mention amazingly useful.

And if you want to see the full extent of the power of this map, follow the link. It shows the individual addresses of the places, whether access is free, and phone contact details. http://cdb.io/1f9fCwH

Or how about every meteorite that has ever hit the Earth? This map, created using CartoDB by Simon Rogers, accurately shows where meteorites have fallen, all over the world. (http://www.theguardian.com/news/datablog/interactive/2013/feb/15/meteorite-fall-map)

[Image: https://simonrogers.cartodb.com/maps]

Maps no longer play a mundane role in my life. I am definitely going to be using this site, and I encourage everyone reading this to do the same.

Visit data.gov.au for pre-made data sets. 

    • #cartodb
    • #mapping
    • #guest
  • 3 years ago

Research tools for Law

By Anna Dziedzic

Law is probably one of the most book-bound disciplines in the academy. “Reading law” is what lawyers call practical legal training. Lawyers love the idea of bookshelves filled with identically bound volumes of law reports.  Sure, legal researchers use library databases and e-journals – but more often than not, we download and print out the pdf before taking notes by hand and writing up our research for publication in hardcopy.

In short, legal research is ready for a digital revolution. After all, law is text, text is data, and data can be mined, mapped and manipulated. And so enter ResBaz. Research Platform Services, in conjunction with Melbourne Law School's Graduate Researchers' Association, recently ran a workshop to introduce legal researchers at MLS to a range of digital tools to support legal research endeavours.

The workshop was organised around four stages of legal research: collecting material, managing and analysing data, writing, and sharing. Daniel McDonald introduced text mining for lawyers, and the potential for Python and the Natural Language Toolkit to analyse large volumes of text such as judgments, legislation, or representations of law in the media. Lachlan Musicman demystified databases, and Fiona Tweedie explained collaborative writing with Authorea and creating digital exhibitions using Omeka. I took the opportunity to spruik my global maps charting every national constitution ever written, created using CartoDB. Finally, Dejan Jotanovic encouraged us to stop lurking on Twitter and start communicating our research to the rest of the world.

Now @heyDejan debunks some myths of social media for researchers - check out #auslaw or #auscon! pic.twitter.com/ic6C18A64h

— Research Platforms (@ResPlat)
April 28, 2015

And the reaction from legal researchers? For some, it was the first time they were exposed to so much tech talk, and naturally it takes a while to become familiar with new languages and concepts. (Now we know how non-lawyers feel when we use incomprehensible legal jargon!) For others, this brief introduction opened up exciting new directions for legal research, or for learning new skills and doing research more efficiently.  So while nothing will replace the close reading that lies at the core of legal research, digital tools can provide legal researchers with the means to take legal research down new pathways, to collaborate across disciplines and to communicate our research to new audiences in new ways.

Anna Dziedzic is a PhD student at Melbourne Law School and a recent convert to social media and mapping.

Anna shares her research journey from word doc to interactive @cartoDB map with a little help from us! pic.twitter.com/xHAu04c3bd

— Research Platforms (@ResPlat)
April 28, 2015
    • #law
    • #nltk
    • #omeka
    • #social media
    • #cartodb
    • #authorea
    • #speed dating
    • #guest
  • 3 years ago

About

Welcome to the Research Platform Services Blog. We're here to help you do your research better! We'll connect you with the best research tools, workshops, expertise & community. Need more information? Check out our pages below!

http://research.unimelb.edu.au/infrastructure/research-platform-services

Pages

  • About us
  • Sign-up for FREE researcher training HERE
  • ResPlat Training Catalogue
  • Calendar of Events and Trainings
  • CoLab: A New Collaborative Space for Researchers!
  • Mailing List
  • The Research Bazaar 2018
  • #MyResearch Video Campaign
  • Resbook

Me, Elsewhere

  • @ResPlat on Twitter
  • ResBaz on Youtube
  • ResBaz on Flickr
  • resbaz on github
  • ResBaz on Instagram
